neilxu6

neilxu's repositories

clothes

clothes with projects

Language: PHP | License: NOASSERTION | 0 0 0

tiebaSpider

Baidu Tieba crawler

Language: Python | 0 0 0

SUESInformationSharingPlatform

Information sharing platform

1 0 0

IpSpider

Crawls multiple proxy-IP websites and checks whether each IP is usable

Language: Python | 1 0 0

wechat_spider-1

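WeChat official account crawler (core implementation based on a man-in-the-middle attack; supports batch crawling of all historical articles of an official account)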

Language: Go | 0 0 0

tesseract

Tesseract Open Source OCR Engine (main repository)

Language: C++ | License: Apache-2.0 | 0 0 0

SinaSpider

Sina Weibo crawler (Scrapy, Redis)

Language: Python | 0 0 0

wechat-spider

WeChat official account crawler

Language: Python | 0 0 0

pydata-book

Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media

Language: Jupyter Notebook | License: NOASSERTION | 0 0 0

wechat_spider

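A WeChat crawler that uses the Sogou WeChat entry point. Implemented in Python on top of PhantomJS. Uses paid dynamic proxies. Collects article text, read counts, like counts, comments, and comment like counts. Throughput: 500 official accounts per hour. Work is split across threads by official account, so collection can run in parallel.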

Language: Python | 0 0 0

WechatSogou

WeChat official account crawler interface based on Sogou WeChat search

Language: Python | 0 0 0

sundry_tools_demo

A collection of assorted tool demos

Language: Python | 0 0 0

guazi_spider

Web scraping of Guazi (瓜子二手车直卖网), URL: https://www.guazi.com/sh/

Language: Python | 0 0 0

ganji_spider

Web scraping of Ganji (赶集), URL: http://sh.ganji.com/

Language: Python | 0 0 0

sundry_api_demos

A collection of assorted API demos

Language: Python | 0 0 0

58tongcheng_spider

Web scraping of 58.com (58同城), URL: http://sh.58.com/

Language: Python | 0 0 0

mahout

Mirror of Apache Mahout

Language: Java | License: NOASSERTION | 0 0 0

lenskit

LensKit recommender toolkit.

Language: Java | License: LGPL-2.1 | 0 0 0

zhihu_spider

Zhihu crawler

Language: Python | 1 0 0

python-recsys

A Python library for implementing a recommender system

Language: Python | 1 0 0

CnkiSpider

CNKI crawler

Language: Python | 0 0 0