The fastest web crawler and indexer. Foundational building blocks for data curation workloads.
- Concurrent
- Streaming
- Decentralization
- Headless Chrome Rendering
- HTTP Proxies
- Cron Jobs
- Subscriptions
- Smart Mode
- Blacklisting and Budgeting Depth
- Changelog
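To make the blacklisting and depth budgeting feature above concrete, here is a minimal sketch using only the Rust standard library. It is not Spider's API: `CrawlPolicy`, `allows`, `crawl`, and the in-memory link graph are hypothetical names chosen for illustration, and the graph stands in for real HTTP fetches. It assumes a blacklist of URL substrings and a maximum link depth counted from the start page.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical crawl policy for illustration; not part of the Spider API.
struct CrawlPolicy {
    /// URL substrings that must never be visited.
    blacklist: Vec<String>,
    /// Maximum link depth from the start page (the start page is depth 0).
    max_depth: usize,
}

impl CrawlPolicy {
    /// A URL is allowed when it is within the depth budget and matches no blacklist entry.
    fn allows(&self, url: &str, depth: usize) -> bool {
        depth <= self.max_depth && !self.blacklist.iter().any(|b| url.contains(b.as_str()))
    }
}

/// Breadth-first walk over an in-memory link graph (a stand-in for fetching pages).
/// Returns the URLs visited, in visit order.
fn crawl(start: &str, graph: &HashMap<&str, Vec<&str>>, policy: &CrawlPolicy) -> Vec<String> {
    let mut visited: HashSet<String> = HashSet::new();
    let mut order = Vec::new();
    let mut queue = VecDeque::new();
    queue.push_back((start.to_string(), 0usize));

    while let Some((url, depth)) = queue.pop_front() {
        // Skip URLs that break the policy or were already seen.
        if !policy.allows(&url, depth) || !visited.insert(url.clone()) {
            continue;
        }
        order.push(url.clone());
        if let Some(children) = graph.get(url.as_str()) {
            for child in children {
                queue.push_back((child.to_string(), depth + 1));
            }
        }
    }
    order
}

fn main() {
    let mut graph = HashMap::new();
    graph.insert(
        "https://example.com/",
        vec!["https://example.com/a", "https://example.com/admin"],
    );
    graph.insert("https://example.com/a", vec!["https://example.com/a/b"]);
    graph.insert("https://example.com/a/b", vec!["https://example.com/a/b/c"]);

    let policy = CrawlPolicy {
        blacklist: vec!["/admin".to_string()],
        max_depth: 2,
    };

    for url in crawl("https://example.com/", &graph, &policy) {
        println!("{url}");
    }
}
```

In this sketch the `/admin` page is skipped by the blacklist and `/a/b/c` is skipped because it sits at depth 3, beyond the budget of 2. A real crawler would likely match blacklist entries with patterns rather than plain substrings; substring matching is used here only to keep the example short.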
The simplest way to get started is to use Spider Cloud, a pain-free hosted service. For local installations, see the spider or spider_cli directory. You can also use Spider from Node.js through the spider-nodejs project.
See BENCHMARKS.
See EXAMPLES.
This project is licensed under the MIT license.
See CONTRIBUTING.