zhangyue19921010

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs for Scala, Java, Rust, Ruby, and Python.

Language: Scala, License: Apache-2.0

disruptor

High Performance Inter-Thread Messaging Library

Language: Java, License: Apache-2.0

druid

Apache Druid: a high performance real-time analytics database.

Language: Java, License: Apache-2.0

druid-arbitrary-granularity

Druid extension that allows custom time intervals for a query's granularity

Language: Java, License: MIT

druid-operator-1

Apache Druid On Kubernetes

Language: Go, License: NOASSERTION

dt-sql-parser

SQL Parsers for BigData, built with antlr4.

Language: JavaScript

flink

Apache Flink

Language: Java, License: Apache-2.0

flink-cdc-connectors

Change Data Capture (CDC) Connectors for Apache Flink

Language: Java, License: Apache-2.0

flink-learning

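Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, internals, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, and Table API & SQL, plus case studies of large production Flink projects (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Support for my column "Big Data Real-Time Compute Engine Flink: Practice and Performance Optimization" is welcome.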

Language: Java, License: Apache-2.0

flink-table-store

An Apache Flink subproject to provide storage for dynamic tables.

Language: Java, License: Apache-2.0

git

Git Source Code Mirror - This is a publish-only repository but pull requests can be turned into patches to the mailing list via GitGitGadget (https://gitgitgadget.github.io/). Please follow Documentation/SubmittingPatches procedure for any of your improvements.

Language: C, License: NOASSERTION

go-duckdb

go-duckdb provides a database/sql driver for the DuckDB database engine.

Language: C++, License: MIT

grafana

The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.

Language: TypeScript, License: AGPL-3.0

hudi

Upserts, Deletes And Incremental Processing on Big Data.

Language: Java, License: Apache-2.0

incubator-celeborn

Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.

Language: Java, License: Apache-2.0

incubator-kyuubi

Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark

Language: Scala, License: Apache-2.0

LearningSparkV2

This is the github repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]

Language: Scala, License: Apache-2.0

mapdb

MapDB provides concurrent Maps, Sets and Queues backed by disk storage or off-heap-memory. It is a fast and easy to use embedded Java database engine.

Language: Java, License: Apache-2.0

rocksdb

A library that provides an embeddable, persistent key-value store for fast storage.

Language: C++, License: GPL-2.0

rondb

This is RonDB, a distribution of NDB Cluster developed and used by Hopsworks AB. It also contains development branches of RonDB.

Language: C++, License: NOASSERTION

spark-1

A simple expressive web framework for Java. Spark has a Kotlin DSL: https://github.com/perwendel/spark-kotlin

Language: Java, License: Apache-2.0

spark-sql-perf

Language: Scala, License: Apache-2.0

starrocks

StarRocks is a next-gen sub-second MPP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics and ad-hoc query.

Language: Java, License: Apache-2.0

trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Language: Java, License: Apache-2.0