YueZhang's repositories
airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
benchmark
A microbenchmark support library
build-your-own-x
Master programming by recreating your favorite technologies from scratch.
click
The "Command Line Interactive Controller for Kubernetes"
CS-Notes
:books: Essential fundamentals for technical interviews: Leetcode, operating systems, computer networks, and system design
delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs for Scala, Java, Rust, Ruby, and Python.
disruptor
High Performance Inter-Thread Messaging Library
druid
Apache Druid: a high performance real-time analytics database.
druid-arbitrary-granularity
Druid extension that allows custom time intervals for a query's granularity
druid-operator-1
Apache Druid On Kubernetes
dt-sql-parser
SQL Parsers for BigData, built with antlr4.
flink
Apache Flink
flink-cdc-connectors
Change Data Capture (CDC) Connectors for Apache Flink
flink-learning
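Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink fundamentals, concepts, internals, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, and Table API & SQL, along with case studies of large production Flink projects (PVUV, log storage, real-time deduplication over ten-billion-record data, monitoring and alerting). Support for my column "Big Data Real-Time Compute Engine Flink: Practice and Performance Optimization" (《大数据实时计算引擎 Flink 实战与性能优化》) is welcome.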
flink-table-store
An Apache Flink subproject to provide storage for dynamic tables.
git
Git Source Code Mirror - This is a publish-only repository but pull requests can be turned into patches to the mailing list via GitGitGadget (https://gitgitgadget.github.io/). Please follow Documentation/SubmittingPatches procedure for any of your improvements.
go-duckdb
go-duckdb provides a database/sql driver for the DuckDB database engine.
grafana
The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.
hudi
Upserts, Deletes And Incremental Processing on Big Data.
incubator-celeborn
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
incubator-kyuubi
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
LearningSparkV2
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
mapdb
MapDB provides concurrent Maps, Sets and Queues backed by disk storage or off-heap-memory. It is a fast and easy to use embedded Java database engine.
rocksdb
A library that provides an embeddable, persistent key-value store for fast storage.
rondb
This is RonDB, a distribution of NDB Cluster developed and used by Hopsworks AB. It also contains development branches of RonDB.
spark-1
A simple, expressive web framework for Java. Spark has a Kotlin DSL: https://github.com/perwendel/spark-kotlin
starrocks
StarRocks is a next-gen sub-second MPP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics and ad-hoc query.
trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)