Webster
Overview
Webster is a reliable web crawling and scraping framework written in Node.js, used to crawl websites and extract structured data from their pages. What sets Webster apart from other crawling frameworks is that it can scrape content rendered in the browser by client-side JavaScript and Ajax requests.
Docker quick start
Pull and run the example Docker image:
docker pull zhuyingda/webster-demo
docker run -it zhuyingda/webster-demo
Then run the simple demo crawler for a Baidu search results page:
node demo_producer.js && node demo_consumer.js
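The demo's file names and the Redis requirement suggest a producer/consumer split: one process queues crawl tasks and another picks them up. The following is only a minimal sketch of that pattern, assuming a plain in-memory array in place of a Redis-backed queue. The names `produce` and `consume`, the task shape, and the placeholder URL are hypothetical and are not Webster's API.

```javascript
// Illustrative producer/consumer split using an in-memory queue.
// Assumption: the array stands in for a Redis-backed queue. None of these
// names (produce, consume, the { url } task shape) come from Webster.

// Producer side: turn a list of URLs into queued tasks.
function produce(queue, urls) {
  urls.forEach(function (url) {
    queue.push({ url: url });
  });
  return queue.length; // number of tasks now waiting
}

// Consumer side: drain the queue in FIFO order and handle each task.
function consume(queue, handler) {
  const results = [];
  while (queue.length > 0) {
    const task = queue.shift();
    results.push(handler(task));
  }
  return results;
}

const queue = [];
// Placeholder search URL, used only for illustration.
produce(queue, ['https://www.baidu.com/s?wd=webster']);

const results = consume(queue, function (task) {
  // A real consumer would load the page in a browser and extract data here.
  return 'would crawl ' + task.url;
});
console.log(results);
```

Keeping the two roles separate means tasks can be queued once and processed later, which matches running the producer script before the consumer script in the command above.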
Requirements
- Node.js 8.x+ and Redis
- Works on Linux and macOS (Mac OS X)
Alternatively, you can deploy with Docker.
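Before installing, it can help to confirm that the local environment matches the requirements listed above. The following is a minimal sketch of such a preflight check. `checkEnvironment` is a hypothetical helper, not something Webster provides, and it checks only the Node.js version and the operating system; it does not verify that Redis is installed or reachable.

```javascript
// Minimal preflight check for the Node.js version and platform requirements.
// checkEnvironment is a hypothetical helper and is not part of Webster.

function checkEnvironment(nodeVersion, platform) {
  const problems = [];

  // process.versions.node looks like "8.11.3"; compare the major version only.
  const major = parseInt(String(nodeVersion).split('.')[0], 10);
  if (!(major >= 8)) {
    problems.push('Node.js 8.x or later is required, found ' + nodeVersion);
  }

  // process.platform is "linux" on Linux and "darwin" on macOS.
  if (platform !== 'linux' && platform !== 'darwin') {
    problems.push('Unsupported platform: ' + platform);
  }

  return problems;
}

const problems = checkEnvironment(process.versions.node, process.platform);
if (problems.length === 0) {
  console.log('Environment looks compatible.');
} else {
  problems.forEach(function (problem) {
    console.warn(problem);
  });
}
```

If the check reports an unsupported platform, the Docker route described above is the likely fallback.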
Install
npm install webster
Documentation
More details are available here.
License
Copyright (c) 2017-present, Yingda (Sugar) Zhu