handanchen's repositories
awesome-wechat-weapp
A curated collection of WeChat Mini Program development resources :100:
canal
Alibaba's component for incremental subscription to and consumption of MySQL binlogs
CDHExample
Example code for HDFS, MapReduce, Hive, HBase, Kafka, Solr, Spark, ZooKeeper, and Mahout in a CDH cluster environment
CDS
Content Data Store (HDFS/HBase)
ChatBotCourse
A hands-on tutorial on building your own chatbot
clouderasizer
Multipurpose tool for discovering and collecting Cloudera Manager metrics.
django-dynamic-scraper
Creating Scrapy scrapers via the Django admin interface
dw_etl
Data warehouse ETL tooling: incremental and full extraction from MySQL to Hive, merging of Hive tables, and other data-platform cleansing utilities
FinancialNewsSearchEngine
A very simple search engine specialised in financial news (built with Nutch, HBase, Solr, Spring Boot, Bootstrap, and AngularJS)
hbase-increment-index
Secondary indexing for HBase, implemented with HBase + Solr
hbase-indexer
Lily HBase Indexer - indexing HBase, one row at a time
hive-third-functions
Some useful custom Hive UDFs, especially array and JSON functions
kafka-example-in-scala
A Kafka producer and consumer example in Scala and Java
kafka-offset-manager
Move Consumer offsets as you please
kafkaLowLevelConsumer
An example of the Kafka low-level consumer API
KafkaProducerTool
A wrapper around a custom Kafka producer
maxwell
Maxwell's daemon, a MySQL-to-JSON Kafka producer
papers-we-love
Papers from the computer science community to read and discuss.
puppet-cdh
Puppet module for Hadoop and the rest of Cloudera's CDH 5.
reair
ReAir is a collection of easy-to-use tools for replicating tables and partitions between Hive data warehouses.
scrapyd
A service daemon to run Scrapy spiders
show-me-the-code
A Python exercise book: one small program a day
streamingpro
Build Spark Streaming applications with SQL
ThinkBayes
Code repository for Think Bayes.
wechat_sogou_crawl
A crawler for WeChat official-account articles, based on Sogou WeChat search
wechat_spider
A WeChat crawler based on the Sogou WeChat entry point, implemented in Python on top of PhantomJS and using paid dynamic proxies. Collects article text, read counts, like counts, comments, and comment like counts. Throughput: 500 official accounts per hour. Accounts are partitioned across multiple threads for parallel crawling.
weixin
Scraping Sogou WeChat articles with Scrapy
yugong
Alibaba's Oracle-migration data sync tool (full + incremental; supported targets: MySQL/DRDS)