spatie / crawler

An easy to use, powerful crawler implemented in PHP. Can execute Javascript.

Home Page: https://freek.dev/308-building-a-crawler-in-php

Honeypot

mitmelon opened this issue · comments

Can this crawler detect honeypot links?

No.
Can you explain how this crawler extracts links from a page for crawling? Maybe I could write a function that checks whether an extracted link is a honeypot before the crawler crawls it. Or just show me the function in the library that does that.

You get the content of a crawled link through the crawled method of an observer: https://github.com/spatie/crawler#usage

There, you can do whatever you want with the $response. You might find Symfony's DomCrawler component handy for getting the nodes you are looking for in the HTML of the response.
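
As a rough illustration of that idea, here is a minimal sketch of a honeypot check that could be run on the HTML of a response. It uses PHP's built-in DOMDocument rather than DomCrawler so that it runs without extra packages. The function names findSuspectedHoneypotLinks and isHiddenElement are hypothetical and not part of spatie/crawler, and the heuristic (links hidden through the hidden attribute or an inline style) is an assumption about what counts as a honeypot. It cannot see rules from external stylesheets.

```php
<?php

/**
 * Return the href of every <a> element that looks hidden from human visitors.
 * Hypothetical helper, not part of spatie/crawler. Heuristic only: it checks
 * the `hidden` attribute and inline styles on the link and its ancestors.
 */
function findSuspectedHoneypotLinks(string $html): array
{
    if (trim($html) === '') {
        return [];
    }

    $dom = new DOMDocument();

    // Collect parser warnings internally; real-world HTML is rarely valid.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $suspects = [];

    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        if ($href === '') {
            continue;
        }

        // Walk from the link up through its ancestors; stop at the document node.
        for ($node = $anchor; $node instanceof DOMElement; $node = $node->parentNode) {
            if (isHiddenElement($node)) {
                $suspects[] = $href;
                break;
            }
        }
    }

    return $suspects;
}

/**
 * True when the element carries the `hidden` attribute or an inline style
 * containing display:none or visibility:hidden.
 */
function isHiddenElement(DOMElement $element): bool
{
    if ($element->hasAttribute('hidden')) {
        return true;
    }

    // Normalise the inline style so "display: none" and "DISPLAY:NONE" both match.
    $style = strtolower(preg_replace('/\s+/', '', $element->getAttribute('style')));

    return strpos($style, 'display:none') !== false
        || strpos($style, 'visibility:hidden') !== false;
}
```

Inside the observer's crawled method this could be called as findSuspectedHoneypotLinks((string) $response->getBody()), assuming $response is a PSR-7 response object. Note that this only flags links on a page that has already been fetched. How to keep the crawler from following the flagged URLs is not covered in this thread.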

Thank you very much for the answer. Now I get the whole idea.