spekulatius / PHPScraper

A universal web-util for PHP.

Home Page: https://phpscraper.de

[Request] Add robots.txt parsing

joshua-bn opened this issue

It would be nice to be able to parse robots.txt, similar to how RSS feeds are handled, e.g. via $web->robots

https://github.com/bopoda/robots-txt-parser is a library that could handle this. I'm not sure it is the right one to use here, but it seems to do the job.

Yeah, that's something to consider. I would opt for https://github.com/spatie/robots-txt instead, as it's better maintained. What exactly do you want to achieve with the information?

Personally, I am looking for sitemaps declared in robots.txt, but I think there's also value in checking the crawling rules.
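
To make the sitemap use case concrete, here is a minimal sketch in plain PHP that pulls the `Sitemap:` entries out of the text of a robots.txt file. It uses neither of the libraries mentioned above, and the function name `extractSitemaps` is hypothetical, not an existing PHPScraper method. It assumes the file content has already been fetched as a string, and it treats everything after a `#` as a comment, so a sitemap URL containing a fragment would be cut short.

```php
<?php

/**
 * Hypothetical helper (not part of PHPScraper): collect the URLs of all
 * "Sitemap:" directives found in the text of a robots.txt file.
 *
 * @param string $robotsTxt Raw content of a robots.txt file.
 * @return string[] Unique sitemap URLs in order of first appearance.
 */
function extractSitemaps(string $robotsTxt): array
{
    $sitemaps = [];

    // Split on any of the common line endings (\r\n, \r, \n).
    foreach (preg_split('/\r\n|\r|\n/', $robotsTxt) as $line) {
        // Assumption: "#" starts a comment that runs to the end of the line.
        $line = preg_replace('/\s*#.*$/', '', $line);

        // The directive name is matched case-insensitively.
        if (preg_match('/^\s*sitemap\s*:\s*(\S+)/i', $line, $matches)) {
            $sitemaps[] = $matches[1];
        }
    }

    // Remove duplicates and reindex the array from zero.
    return array_values(array_unique($sitemaps));
}
```

Crawl rules (`Allow` / `Disallow` per user agent) need more careful matching than this, which is where a maintained parser library would be the better fit.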