WWW::RobotRules::Parser is a simple parser for robots.txt files in the
format described in http://www.robotstxt.org/wc/norobots.html.

WWW: https://metacpan.org/release/WWW-RobotRules-Parser
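
To illustrate the kind of input such a parser handles, here is a minimal Python sketch of the record format from the norobots document: records are separated by blank lines, each record starts with one or more User-agent lines, and Disallow lines follow. This is not the module's API or implementation; the function name `parse_robots_txt` and the returned dictionary layout are assumptions made for this example only.

```python
def parse_robots_txt(text):
    """Group Disallow paths by User-agent.

    Hypothetical helper for illustration; not part of WWW::RobotRules::Parser.
    """
    rules = {}
    agents = []        # user agents named by the record currently being read
    seen_rule = False  # True once the current record has a Disallow line
    for raw in text.splitlines():
        if not raw.strip():
            # A truly blank line ends the current record.
            agents, seen_rule = [], False
            continue
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or ":" not in line:
            # Comment-only or malformed lines are skipped without
            # ending the record.
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if seen_rule:
                # A User-agent line after rules starts a new record.
                agents, seen_rule = [], False
            agents.append(value)
            rules.setdefault(value, [])
        elif field == "disallow" and agents:
            seen_rule = True
            if value:  # an empty Disallow value means nothing is disallowed
                for agent in agents:
                    rules[agent].append(value)
    return rules
```

The sketch ignores every field other than User-agent and Disallow, which is all the original norobots document defines.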