spatie / crawler

An easy-to-use, powerful crawler implemented in PHP. Can execute JavaScript.

Home Page: https://freek.dev/308-building-a-crawler-in-php

Is it possible to modify the URL to be crawled?

mw108 opened this issue

Is it possible to modify the URL that will be crawled, for instance by appending "index.php" to the end of it?

URL to be crawled: https://test.mysite.com/my-uri/
Modified URL that should be crawled instead: https://test.mysite.com/my-uri/index.php
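
To make the requested rewrite concrete, here is a minimal sketch of the transformation in plain PHP. The function name appendIndexFile is hypothetical and is not part of spatie/crawler; it only illustrates turning the first URL above into the second, and it assumes the URL has a scheme and a host.

```php
<?php

// Hypothetical helper, not part of spatie/crawler: appends a file name
// to URLs whose path ends in a slash (or is missing entirely).
// Simplification: user info (user:pass@) is not preserved.
function appendIndexFile(string $url, string $file = 'index.php'): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url; // leave anything we cannot parse untouched
    }

    $path = $parts['path'] ?? '/';
    if (substr($path, -1) === '/') {
        $path .= $file;
    }

    // Reassemble the URL from its parsed components.
    $result = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $result .= ':' . $parts['port'];
    }
    $result .= $path;
    if (isset($parts['query'])) {
        $result .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $result .= '#' . $parts['fragment'];
    }

    return $result;
}

echo appendIndexFile('https://test.mysite.com/my-uri/'), PHP_EOL;
// prints "https://test.mysite.com/my-uri/index.php"
```

URLs whose path does not end in a slash are returned unchanged, so a link that already points at a file is not rewritten.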

Though I've never used it that way, I think you can modify the URL that's passed to willCrawl of a crawl observer. Check the docs to learn how to create a custom crawl observer.

Could we take another look at this? What you are passed is an instance of Psr\Http\Message\UriInterface, which is immutable, and willCrawl doesn't return a value, so there's no easy way to modify the URL. This feature would be useful. Thanks.
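
The immutability problem can be illustrated with a small, self-contained sketch. SketchUri below is a hypothetical stand-in that only mimics the "with*" style of PSR-7 URIs (a real UriInterface offers getPath() and withPath() in the same spirit), and the willCrawl function is shaped after the description in this thread rather than copied from the library.

```php
<?php

// Minimal stand-in that mimics the immutable "with*" style of PSR-7 URIs.
// It is NOT the real Psr\Http\Message\UriInterface, only an illustration.
final class SketchUri
{
    private string $path;

    public function __construct(string $path)
    {
        $this->path = $path;
    }

    public function getPath(): string
    {
        return $this->path;
    }

    // Returns a modified copy; the original instance is left unchanged.
    public function withPath(string $path): self
    {
        $copy = clone $this;
        $copy->path = $path;

        return $copy;
    }
}

// Shaped like an observer hook that returns nothing (an assumption
// based on the description above, not the library's actual signature).
function willCrawl(SketchUri $url): void
{
    $modified = $url->withPath($url->getPath() . 'index.php');
    // $modified is discarded here: there is no return value to hand it back.
}

$url = new SketchUri('/my-uri/');
willCrawl($url);

echo $url->getPath(), PHP_EOL; // prints "/my-uri/"
```

Because the modified copy never leaves the hook, the caller still sees the original path, which is why changing the URL from inside the observer has no effect.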