crwlrsoft / robots-txt

Robots Exclusion Standard/Protocol Parser for Web Crawling/Scraping

Use this library in crawler/scraper programs to parse robots.txt files and check whether your crawler's user agent is allowed to load a given path.
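To make the idea concrete, here is a minimal, self-contained sketch of what such a check involves. It does not use this library and does not show its API: the function names `parseRobotsTxt` and `isPathAllowed` are hypothetical, chosen only for illustration. The sketch assumes plain prefix matching (no `*` or `$` wildcards), an exact, case-insensitive user-agent match with a fallback to the `*` group, and "longest matching rule wins" precedence. For real crawlers, use the library as described in its documentation.

```php
<?php

/**
 * Hypothetical helper, NOT part of this library's API.
 * Parses robots.txt content into a map of
 * lowercased user-agent => list of [ruleType, pathPrefix] pairs.
 */
function parseRobotsTxt(string $content): array
{
    $groups = [];
    $currentAgents = [];
    $lastWasAgent = false;

    foreach (preg_split('/\R/', $content) as $line) {
        // Strip comments and surrounding whitespace.
        $line = trim((string) preg_replace('/#.*$/', '', $line));

        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }

        [$field, $value] = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);

        if ($field === 'user-agent') {
            // Consecutive User-agent lines share one group of rules.
            if (!$lastWasAgent) {
                $currentAgents = [];
            }
            $agent = strtolower($value);
            $currentAgents[] = $agent;
            $groups[$agent] ??= [];
            $lastWasAgent = true;
        } elseif ($field === 'allow' || $field === 'disallow') {
            foreach ($currentAgents as $agent) {
                $groups[$agent][] = [$field, $value];
            }
            $lastWasAgent = false;
        }
        // Other fields (e.g. Sitemap) are ignored in this sketch.
    }

    return $groups;
}

/**
 * Hypothetical helper, NOT part of this library's API.
 * Returns true if $userAgent may load $path according to the parsed groups.
 */
function isPathAllowed(array $groups, string $userAgent, string $path): bool
{
    // Prefer the group naming this user agent; otherwise fall back to "*".
    $rules = $groups[strtolower($userAgent)] ?? $groups['*'] ?? [];

    $bestLength = -1;
    $allowed = true; // No matching rule means the path is allowed.

    foreach ($rules as [$type, $prefix]) {
        // An empty rule value matches nothing; matching is by plain prefix only.
        if ($prefix === '' || strncmp($path, $prefix, strlen($prefix)) !== 0) {
            continue;
        }
        $length = strlen($prefix);
        // The longest matching prefix wins; on a tie, Allow wins.
        if ($length > $bestLength || ($length === $bestLength && $type === 'allow')) {
            $bestLength = $length;
            $allowed = ($type === 'allow');
        }
    }

    return $allowed;
}

// Example robots.txt content, made up for this illustration.
$robotsTxt = <<<TXT
User-agent: *
Disallow: /private/
Allow: /private/public-report

User-agent: MyBot
Disallow: /search
TXT;

$groups = parseRobotsTxt($robotsTxt);

var_dump(isPathAllowed($groups, 'MyBot', '/search?q=php'));
var_dump(isPathAllowed($groups, 'OtherBot', '/private/public-report'));
```

In this sketch a crawler identifying as `MyBot` is governed only by its own group, while any other user agent falls back to the `*` group, where the longer `Allow` rule overrides the shorter `Disallow` for the report path.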

Documentation

You can find the documentation at crwlr.software.

Contributing

If you are considering contributing to this package, please read the contribution guide (CONTRIBUTING.md).

About

License: MIT License

Languages

PHP 100.0%