Kent Yao's repositories
spark-docker
Official Dockerfile for Apache Spark
aircompressor
A port of Snappy, LZO, LZ4, and Zstandard to Java
cloudberry
Cloudberry Database - Open source alternative to Greenplum Database. Created by the original Greenplum developers.
duckdb
DuckDB is an analytical in-process SQL database management system
grammars-v4
Grammars written for ANTLR v4; the grammars are expected to be free of actions.
hive
Apache Hive
incubator-kyuubi
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
incubator-streampark
Make stream processing easier! Easy-to-use streaming application development framework and operation platform.
kafka
Mirror of Apache Kafka
mongo
The MongoDB Database
official-images
Primary source of truth for the Docker "Official Images" program
orc
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
parquet-format
Apache Parquet Format
parquet-mr
Apache Parquet
polaris
The interoperable, open source catalog for Apache Iceberg
postgres
Mirror of the official PostgreSQL Git repository. Note that this is just a *mirror*: we don't work with pull requests on GitHub. To contribute, please see https://wiki.postgresql.org/wiki/Submitting_a_Patch
spark-connect-go
Apache Spark Connect Client for Golang
spark-kubernetes-operator
Apache Spark Kubernetes Operator
spark-website
Apache Spark Website
trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
unitycatalog
Open, Multi-modal Catalog for Data & AI
zstd-jni
JNI binding for Zstd