Amazon-Books-Crawler is a Python web spider built with the Scrapy framework. For now it has a single spider, which scrapes Python books from https://www.amazon.com search results. For each book the spider extracts title, description, paperback_price, author, star_rate, reviews, img_url and img_path, and stores the results in a SQLite3 database. The data can also be exported to a JSON or CSV file with a single command: scrapy crawl amazon -o file.json. To inspect the logs after crawling, read the file log.txt in the main directory.
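
To illustrate how scraped fields could end up in a SQLite3 database, here is a minimal sketch of a Scrapy-style item pipeline. It is not the project's actual pipeline: the class name `SQLiteBooksPipeline`, the table name `books`, the all-TEXT schema and the default file name `books.db` are assumptions made for the example. Only the field names come from the description above.

```python
import sqlite3


class SQLiteBooksPipeline:
    """Hypothetical pipeline: writes each scraped book to a SQLite table."""

    # Field names as listed in the project description.
    FIELDS = (
        "title", "description", "paperback_price", "author",
        "star_rate", "reviews", "img_url", "img_path",
    )

    def __init__(self, db_path="books.db"):  # "books.db" is an assumed name
        self.db_path = db_path
        self.conn = None

    def open_spider(self, spider):
        # Scrapy calls this once when the spider starts.
        self.conn = sqlite3.connect(self.db_path)
        columns = ", ".join(f"{name} TEXT" for name in self.FIELDS)
        self.conn.execute(f"CREATE TABLE IF NOT EXISTS books ({columns})")

    def process_item(self, item, spider):
        # Fields missing from the item are stored as NULL.
        # Values are assumed to be plain strings or numbers.
        values = [item.get(name) for name in self.FIELDS]
        placeholders = ", ".join("?" for _ in self.FIELDS)
        self.conn.execute(f"INSERT INTO books VALUES ({placeholders})", values)
        self.conn.commit()
        return item

    def close_spider(self, spider):
        # Scrapy calls this once when the spider finishes.
        self.conn.close()
```

A pipeline like this would normally be enabled through the ITEM_PIPELINES setting in the project's settings.py; the exact module path depends on the project layout.
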

Requirements:

- Python3
- Scrapy
- SQLite3

Installation and usage:

- virtualenv -p python3 scrapy_books_spyder
- cd scrapy_books_spyder
- source bin/activate (activates the virtual environment)
- git clone https://github.com/w-e-ll/scrapy-web-spyder.git
- cd scrapy-web-spyder
- pip install -r requirements.txt
- cd amazon
- scrapy crawl amazon
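
If the crawl is run with Scrapy's `-o file.json` option, the export is a JSON array with one object per scraped book. The following sketch shows one way to load and summarize such a file. The helper names `load_books` and `summarize` are hypothetical, and the keys `title` and `author` are assumed to match the spider's field names.

```python
import json


def load_books(path="file.json"):
    """Load the list of book records written by Scrapy's JSON feed export."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def summarize(books):
    """Return (title, author) pairs, using None where a field is absent."""
    return [(book.get("title"), book.get("author")) for book in books]
```

Calling `summarize(load_books("file.json"))` after a crawl returns one pair per exported book, which is a quick way to check that the spider collected what was expected.
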

Made by: https://w-e-ll.com