Links with query parameters (GET parameters) are only crawled when present in the URL passed to startCrawling()
Defcon0 opened this issue
Hello,
I have a website with pagination, i.e. pages where you can click on 1, 2, ... to get the next results of a list. A GET parameter page=x is used for this.
Given the following situation:
- /mypage -> contains a link to /mypage2
- /mypage2 -> contains the paginated list with the items 1-3
- /mypage2?page=2 -> contains the paginated list with the items 4-6
If I pass /mypage2 to the crawler, it finds and crawls the pagination links as well. If I pass /mypage, it finds /mypage2 but not /mypage2?page=2.
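To make the expectation concrete, here is a minimal sketch in Python of the behaviour I expected. It is not the library's code: the example.com host, the SITE map and the crawl() helper are made up for illustration, and the pages are modelled as an in-memory dictionary instead of real HTTP requests. The only point is that a URL with a query string counts as its own page, regardless of which start URL is used.

```python
from collections import deque
from urllib.parse import urljoin, urldefrag

# Hypothetical in-memory site: page URL -> links found on that page.
SITE = {
    "https://example.com/mypage": ["/mypage2"],
    "https://example.com/mypage2": ["/mypage2?page=2"],
    "https://example.com/mypage2?page=2": ["/mypage2"],
}


def crawl(start_url):
    """Breadth-first crawl that keeps query strings when comparing URLs."""
    seen = [start_url]
    queue = deque([start_url])
    while queue:
        current = queue.popleft()
        for href in SITE.get(current, []):
            # Resolve the link against the current page and drop only the
            # fragment; the query string stays part of the URL identity.
            absolute, _fragment = urldefrag(urljoin(current, href))
            if absolute not in seen:
                seen.append(absolute)
                queue.append(absolute)
    return seen


if __name__ == "__main__":
    for url in crawl("https://example.com/mypage"):
        print(url)
```

In this model, starting at /mypage reaches /mypage2?page=2 as well, which is what I expected from the crawler.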
Am I doing something wrong, or is this intentional?
Thanks in advance!
Bye