paragraff / puppeteerCrawler

A just-for-fun package

The war between scrapers and anti-scrapers never ends. More about it here. Why not have some fun with it?

Create an env.json file with these parameters:

  • pathToBrowser - path to the browser executable, if you want to crawl with a specific browser
  • urlToCrowl - URL of the page to crawl
  • pathToBrowserUserData - path where browser user data is stored, if you want to keep it
  • validationRequestUrl - URL of the validation request, if the page has an anti-scraping defense and you want to inspect its protocol
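
As a rough illustration of how these parameters might fit together, the sketch below prints an example env.json and maps it to Puppeteer launch options. The helper name buildLaunchOptions and all values in exampleEnv are hypothetical and not taken from the repository. The sketch also assumes that urlToCrowl is the only required key, which the README does not state; executablePath and userDataDir are standard Puppeteer launch options.

```javascript
// Hypothetical helper (not part of the repository): turn env.json settings
// into an options object that could be passed to puppeteer.launch().
function buildLaunchOptions(env) {
  // Assumption: urlToCrowl is the only key that must always be present.
  if (!env || typeof env.urlToCrowl !== 'string' || env.urlToCrowl === '') {
    throw new Error('env.json must define urlToCrowl');
  }
  const options = {};
  // Optional: crawl with a specific browser binary.
  if (env.pathToBrowser) options.executablePath = env.pathToBrowser;
  // Optional: keep the browser profile (cookies, cache) in this directory.
  if (env.pathToBrowserUserData) options.userDataDir = env.pathToBrowserUserData;
  return options;
}

// Example values only; replace them with your own paths and URLs.
const exampleEnv = {
  pathToBrowser: '/usr/bin/chromium',
  urlToCrowl: 'https://example.com/',
  pathToBrowserUserData: './user-data',
  validationRequestUrl: 'https://example.com/validate',
};

// The JSON text printed here is what could be saved as env.json.
console.log(JSON.stringify(exampleEnv, null, 2));
console.log(buildLaunchOptions(exampleEnv));
```

Leaving out pathToBrowser and pathToBrowserUserData in this sketch simply yields an empty options object, so Puppeteer would fall back to its defaults.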

Run the script with node index.js.
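
The repository's actual index.js is not reproduced in this README, so the following is only a guess at what such a script could do with the env.json parameters: launch the browser, open urlToCrowl, and record any request whose URL starts with validationRequestUrl. The function name crawl, the prefix match, and the networkidle2 wait are assumptions. The Puppeteer module is passed in as an argument so the sketch does not depend on how Puppeteer is installed.

```javascript
// Sketch only; not the repository's real index.js.
// `puppeteer` is the Puppeteer module, `env` is the parsed env.json.
async function crawl(puppeteer, env) {
  const browser = await puppeteer.launch({
    executablePath: env.pathToBrowser || undefined,
    userDataDir: env.pathToBrowserUserData || undefined,
  });
  const seen = [];
  try {
    const page = await browser.newPage();
    if (env.validationRequestUrl) {
      // Record requests sent to the anti-scraping validation endpoint.
      // Assumption: a prefix match on the URL is enough to identify them.
      page.on('request', (request) => {
        if (request.url().startsWith(env.validationRequestUrl)) {
          seen.push({
            method: request.method(),
            url: request.url(),
            body: request.postData(),
          });
        }
      });
    }
    // Wait until the network is mostly idle so late requests are captured.
    await page.goto(env.urlToCrowl, { waitUntil: 'networkidle2' });
  } finally {
    // Always close the browser, even if navigation fails.
    await browser.close();
  }
  return seen;
}
```

With Puppeteer installed, this could be called as crawl(require('puppeteer'), require('./env.json')) and the resolved array printed with console.log.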
