weiboSearchCrawler

A distributed Sina Weibo search spider built on Scrapy, Redis and MongoDB. From each crawled page it extracts user info, forward (repost) info, pictures and related data.

## Reference

scrapy-redis

weibosearch

weibo_login

Installation

$ sudo apt-get install mongodb
$ sudo apt-get install redis-server
$ sudo pip install pymongo
$ sudo pip install -r requirements.txt
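After installation, it can help to confirm that MongoDB and Redis are actually accepting connections before starting a crawl. The following is a minimal sketch using only the Python standard library. The host `localhost` and the ports 27017 and 6379 are assumed defaults, not values taken from this project's settings, and the helper names `is_port_open` and `check_services` are hypothetical.

```python
import socket

# Assumed default endpoints; adjust if your services are configured differently.
DEFAULT_SERVICES = {
    "mongodb": ("localhost", 27017),
    "redis": ("localhost", 6379),
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_services(services=DEFAULT_SERVICES):
    """Map each service name to whether its TCP port is reachable."""
    return {name: is_port_open(host, port) for name, (host, port) in services.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'reachable' if ok else 'not reachable'}")
```

This only checks that something is listening on the port; it does not verify that the listener is really MongoDB or Redis.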

Usage

Put your keywords in items.txt (used here for testing only). You can also read keywords from MySQL.
Start the spider: scrapy crawl weibosearch -a username=your_weibo_account -a password=your_weibo_password
To test the parsing process locally, see weibosearch/spiders/tests.py.
To add another spider, run the same command with a different account: scrapy crawl weibosearch -a username=another_weibo_account -a password=another_weibo_password
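The usage steps can be sketched in Python as follows. This is an illustration under assumptions, not the project's own code: it assumes items.txt holds one keyword per line (the actual format is not documented here), and the helpers `load_keywords` and `build_crawl_command` are hypothetical names. Only the `scrapy crawl weibosearch -a username=... -a password=...` command line itself comes from the steps above.

```python
from pathlib import Path


def load_keywords(path="items.txt"):
    """Read keywords, assuming one per line; skip blank lines and duplicates."""
    seen = set()
    keywords = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        keyword = line.strip()
        if keyword and keyword not in seen:
            seen.add(keyword)
            keywords.append(keyword)
    return keywords


def build_crawl_command(username, password, spider="weibosearch"):
    """Build the argument list for one spider process bound to one Weibo account."""
    return [
        "scrapy", "crawl", spider,
        "-a", f"username={username}",
        "-a", f"password={password}",
    ]
```

Each list returned by `build_crawl_command` could be passed to `subprocess.Popen` from the project directory to start one spider per account.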
