xiaocui's repositories
cuijunyao.github.io
Some useful notes from my personal blog
CoolplaySpark
Coolplay Spark: Spark source code analysis, Spark libraries, and more
dataiku-hive-udf
A collection of Hive UDFs
dockerbook-code
The code and configuration examples from The Docker Book (http://www.dockerbook.com)
draw.io
Flowchart repository
druid
Column oriented distributed data store ideal for powering interactive applications
flink
Apache Flink
flinkStreamSQL
Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables
homebrew
:beer: The missing package manager for OS X.
impyla
Python DB API 2.0 client for Impala and Hive (HiveServer2 protocol)
incubator-zeppelin-druid
Fork of the Zeppelin repo that integrates a Druid interpreter
kafka-python
Python client for Apache Kafka
metabase
The simplest, fastest way to get business intelligence and analytics to everyone in your company :yum:
mongo-sql
An extensible SQL generation library for JavaScript with a focus on introspectability
parquet-mr
As we have moved to Apache, please open your pull requests on: https://github.com/apache/parquet-mr
protobuf
Protocol Buffers - Google's data interchange format
pykafka
Kafka client for Python
SbtSubProjectsExample
An example of using sub-projects in a Scala/SBT project
scala-maven-plugin
The scala-maven-plugin (previously maven-scala-plugin) is used for compiling/testing/running/documenting scala code in maven.
scala-style-guide
Databricks Scala Coding Style Guide
spark-1
Mirror of Apache Spark
spark-hbase-connector
Connect Spark to HBase for reading and writing data with ease
spark-source-code-learn-note
Spark learning notes
SparkInternals
Notes talking about the design and implementation of Apache Spark
spray
A suite of scala libraries for building and consuming RESTful web services on top of Akka: lightweight, asynchronous, non-blocking, actor-based, testable
streaming-offset-to-zk
A small project for manually managing offsets in ZooKeeper when integrating Spark Streaming with Kafka
util
Wonderful reusable code from Twitter
work-notes-in-zhihu
Daily work summaries on IT topics