Butao Zhang's repositories
arctic
Arctic is a streaming lake warehouse service open sourced by NetEase
benchto
Framework for running macro benchmarks in a clustered environment
calcite
Apache Calcite
datanucleus-rdbms-fork
DataNucleus support for persistence to RDBMS Datastores
debezium
Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
doris
Apache Doris is an easy-to-use, high performance and unified analytics database.
flink
Apache Flink
flink-cdc-connectors
CDC Connectors for Apache Flink®
hive
Apache Hive
iceberg
Apache Iceberg
spark
Apache Spark - A unified analytics engine for large-scale data processing
trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
gravitino
World's most powerful data catalog service, providing a high-performance, geo-distributed and federated metadata lake.
hadoop
Apache Hadoop
hbase
Apache HBase
helm
The Kubernetes Package Manager
hue
Hue Editor: Open source SQL Query Assistant for Databases/Warehouses
iceberg-docs
Apache Iceberg Documentation Site
incubator-seatunnel
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
incubator-wayang
Apache Wayang (incubating) is the first cross-platform data processing system.
kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
orc
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
paimon
Apache Paimon (incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking and efficient real-time analytics.
parquet-mr
Apache Parquet
presto
The official home of the Presto distributed SQL query engine for big data
presto-hive-apache
Shaded version of Apache Hive for Presto
ranger
Mirror of Apache Ranger
starrocks
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. Winner of InfoWorld’s 2023 BOSSIE Award for best open source software.
tez
Mirror of Apache Tez
trino-hive-apache-1
Shaded version of Apache Hive for Trino