JunZhang's repositories
bigdata-examples
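Practical big data examples from my day-to-day work, covering Flink, Kafka, Hadoop, Presto, and more. Follow my WeChat official account 【大数据技术与应用实战】 (Big Data Technology and Application in Practice) and let's grow together.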
flink-learning
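Flink learning blog. http://www.54tianzhisheng.cn Covers Flink basics, concepts, principles, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, plus case studies of large-scale production Flink projects (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Your support for my column 《大数据实时计算引擎 Flink 实战与性能优化》 (Flink, the Big Data Real-Time Computing Engine: Practice and Performance Optimization) is welcome.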
Alink
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
awesome-anomaly-detection
A curated list of awesome anomaly detection resources
calcite
Apache Calcite
clickhouse-operator
The Altinity Operator for ClickHouse creates, configures and manages ClickHouse clusters running on Kubernetes
dagster
An orchestration platform for the development, production, and observation of data assets.
eagle
Real-time data processing system based on Flink and CEP
flink-cdc-connectors
CDC Connectors for Apache Flink®
flink-forward-china-2018
Flink Forward China 2018 Slides
flink-recommandSystem-demo
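:helicopter::rocket: A real-time product recommendation system built on Flink. Flink computes product popularity and stores it in a Redis cache, analyzes log data, and writes profile tags and real-time records to HBase. When a user issues a recommendation request, the system re-ranks the popularity list according to the user profile, uses two recommendation modules (collaborative filtering and tag-based) to attach related products to each product in the newly generated list, and finally returns the new user list.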
hudi
Upserts, Deletes And Incremental Processing on Big Data.
incubator-paimon
An Apache Flink subproject to provide storage for dynamic tables.
incubator-streampipes
Apache StreamPipes - A self-service (Industrial) IoT toolbox to enable non-technical users to connect, analyze and explore IoT data streams.
incubator-streampipes-extensions
Apache StreamPipes - A self-service (Industrial) IoT toolbox to enable non-technical users to connect, analyze and explore IoT data streams. This repository contains ready-to-use pipeline elements and adapters for StreamPipes Connect
java8-tutorial
Modern Java - A Guide to Java 8
metabase
The simplest, fastest way to get business intelligence and analytics to everyone in your company :yum:
nessie
Nessie: Transactional Catalog for Data Lakes with Git-like semantics
QConShanghai2018
QCon Shanghai 2018 slides
ranger
Mirror of Apache Ranger
streamx
Make Flink|Spark easier!!! The original intention of StreamX is to make the development of Flink easier. StreamX focuses on the management of development phases and tasks. Our ultimate goal is to build a one-stop big data solution integrating stream processing, batch processing, data warehouse and data lake.
trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.