handanchen's repositories
scrapy-examples
Multifarious scrapy examples.
VeryNginx
A very powerful nginx with a human-friendly interface, based on lua-nginx-module (OpenResty), which provides a custom WAF (firewall), custom actions, and statistics.
solrj-example
SolrJ examples
Pythonspider
A simple Python crawler, built with plain Python + BeautifulSoup
scrapy-dynamic-configurable
A dynamically configurable news crawler based on Scrapy
e-business
E-commerce crawler system: crawlers for JD.com (京东), Dangdang (当当), Yihaodian (一号店), and Gome (国美), using proxies
jedis
A blazingly small and sane redis java client
bad-data-guide
An exhaustive reference to problems seen in real-world data along with suggestions on how to resolve them.
tornado
Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed.
hadoop
Mirror of Apache Hadoop
awesome-java-cn
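Chinese edition of the awesome Java resource list, covering libraries, development tools, websites, blogs, WeChat accounts, Weibo accounts, and more; continuously updated by Jobbole (伯乐在线).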
elite-proxies-scrapy-middleware
Elite Proxies (http://elite.proxies.online) middleware for scrapy http://rev.proxies.online
hive-sqoop-serde-tutorial
How to use Hive-Sqoop-Serde to load Sqoop-generated sequence files into Hive
distribute_crawler
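A distributed web crawler built with Scrapy, Redis, MongoDB, and Graphite: a MongoDB cluster provides the underlying storage, Redis handles distribution, and Graphite displays crawler status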
django-redis
Full featured redis cache backend for Django.
QunarSpider
Web crawling with Selenium: logging in through a proxy to crawl the Qunar (去哪儿) website
Hadoop-CombineFileInputFormat
Example implementation of hadoop CombineFileInputFormat
scrapy-proxies
Random proxy middleware for Scrapy
mysched
my scheduling system
qqwry-java
A Java library to read the QQWry IP database (纯真 IP address database).
douappbook
Crawl book and rating information from the Douban app
hdfs-file-slurper
Utility to easily copy files into HDFS
solrcloud
Distributed Solr deployment and management
cloud
Cloud computing: environment setup and configuration files for Hadoop, Hive, Hue, Oozie, Sqoop, HBase, and ZooKeeper
seeyon
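Some notes from Beijing Zhiyuan Xiechuang Software Technology Co., Ltd. (北京致远协创软件科技有限公司), including standards, documentation, programming tips, and so on.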
hbase-solr-coprocessor
Secondary indexing for HBase via Solr, implemented mainly through the Observer of HBase's coprocessor framework.