yrxwin / birdspider

web crawler for a birds website

birdspider

A web crawler for a birds website.

Description

This project is based on Scrapy.

Install

Install Scrapy with pip install Scrapy; using a virtualenv is highly recommended. If you run into any problems, refer to the installation guide in the official Scrapy documentation.

Usage

  1. cd to the project's root directory
  2. run scrapy runspider myspider.py in a shell
  3. find the birds.json file in the same directory; all crawled data is in it
  4. enjoy :)
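
Once birds.json exists, it can be loaded for a quick look at what was crawled. The following is only a minimal sketch: the helper name load_birds is hypothetical, and the layout of birds.json is an assumption (either a single JSON array, or one JSON object per line), since the README does not describe the file format or its fields.

```python
import json
from pathlib import Path


def load_birds(path="birds.json"):
    """Return the crawled records as a list (hypothetical helper)."""
    text = Path(path).read_text(encoding="utf-8").strip()
    if not text:
        return []
    try:
        # Assumption: the whole file is one JSON document, typically an array.
        data = json.loads(text)
    except json.JSONDecodeError:
        # Fallback assumption: one JSON object per line (JSON Lines layout).
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    # An array is returned as-is; any other single value is wrapped in a list.
    return data if isinstance(data, list) else [data]
```

Calling load_birds() from the project's root directory after a crawl would return the records as a list, so len(load_birds()) gives the number of crawled items.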

Modification

Change the target website and the pages to crawl by changing this line.
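
The referenced line is not reproduced here. In a typical Scrapy spider, the crawl targets are listed in the start_urls class attribute, so changing the site or the page range usually means changing that list. The sketch below shows one way such a list might be built; the function build_start_urls, the example.com address, and the URL pattern are all hypothetical and are not taken from myspider.py.

```python
from urllib.parse import urljoin


def build_start_urls(base_url, path_template, pages):
    """Build one absolute URL per page number (hypothetical helper)."""
    return [urljoin(base_url, path_template.format(page=n)) for n in pages]


# Hypothetical site and URL pattern; replace both with the real target.
start_urls = build_start_urls("https://example.com/", "birds?page={page}", range(1, 4))
```

A list like this could then be assigned to start_urls inside the spider class, assuming the spider uses that attribute to choose its targets.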

License: MIT License


Languages

Language: Python 100.0%