Danny Chan (danny0405)

Company: OneHouse

Location: Hangzhou, China

Home Page: http://yuzhao.site

Twitter: @danny_chan_

Danny Chan's starred repositories

tidb

TiDB is an open-source, cloud-native, distributed, MySQL-compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at https://www.pingcap.com/tidb-serverless/

Language: Go | License: Apache-2.0 | Stargazers: 36967 | Issues: 1262 | Issues: 18947

ClickHouse

ClickHouse® is a real-time analytics DBMS

Language: C++ | License: Apache-2.0 | Stargazers: 36861 | Issues: 691 | Issues: 21246

leveldb

LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.

Language: C++ | License: BSD-3-Clause | Stargazers: 36238 | Issues: 1312 | Issues: 757

cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.

Language: Go | License: NOASSERTION | Stargazers: 29941 | Issues: 694 | Issues: 65557

airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

Language: Python | License: NOASSERTION | Stargazers: 15594 | Issues: 181 | Issues: 14230

foundationdb

FoundationDB - the open source, distributed, transactional key-value store

Language: C++ | License: Apache-2.0 | Stargazers: 14399 | Issues: 293 | Issues: 1731

arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing

Language: C++ | License: Apache-2.0 | Stargazers: 14330 | Issues: 349 | Issues: 25581

dolly

Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform

Language: Python | License: Apache-2.0 | Stargazers: 10811 | Issues: 137 | Issues: 162

debezium

Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.

Language: Java | License: Apache-2.0 | Stargazers: 10469 | Issues: 217 | Issues: 0

dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

Language: Python | License: Apache-2.0 | Stargazers: 9673 | Issues: 139 | Issues: 5422

redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!

awesome-database-learning

A list of learning materials for understanding database internals

seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.

Language: Java | License: Apache-2.0 | Stargazers: 7846 | Issues: 173 | Issues: 3270

delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

Language: Scala | License: Apache-2.0 | Stargazers: 7462 | Issues: 216 | Issues: 1485

materialize

The Cloud Operational Data Store: use SQL to transform, deliver, and act on fast-changing data.

Language: Rust | License: NOASSERTION | Stargazers: 5728 | Issues: 76 | Issues: 2165

hudi

Upserts, Deletes And Incremental Processing on Big Data.

Language: Java | License: Apache-2.0 | Stargazers: 5341 | Issues: 1166 | Issues: 3222

noria

Fast web applications through dynamic, partially-stateful dataflow

Language: Rust | License: Apache-2.0 | Stargazers: 4987 | Issues: 112 | Issues: 79

YCSB

Yahoo! Cloud Serving Benchmark

Language: Java | License: Apache-2.0 | Stargazers: 4923 | Issues: 215 | Issues: 952

calcite

Apache Calcite

Language: Java | License: Apache-2.0 | Stargazers: 4556 | Issues: 168 | Issues: 0

timely-dataflow

A modular implementation of timely dataflow in Rust

Language: Rust | License: MIT | Stargazers: 3259 | Issues: 86 | Issues: 176

dinky

Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.

Language: Java | License: Apache-2.0 | Stargazers: 3074 | Issues: 38 | Issues: 1408

differential-dataflow

An implementation of differential dataflow using timely dataflow on Rust.

Language: Rust | License: MIT | Stargazers: 2553 | Issues: 49 | Issues: 157

blog

Some notes on things I find interesting and important.

bitsail

BitSail is a distributed high-performance data integration engine which supports batch, streaming and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of data every day.

Language: Java | License: Apache-2.0 | Stargazers: 1619 | Issues: 62 | Issues: 212

sqlancer

Automated testing to find logic and performance bugs in database systems

Language: Java | License: MIT | Stargazers: 1473 | Issues: 32 | Issues: 139

Language: Jupyter Notebook | License: MIT | Stargazers: 1028 | Issues: 23 | Issues: 25

amoro

Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.

Language: Java | License: Apache-2.0 | Stargazers: 664 | Issues: 29 | Issues: 1181

rlink-rs

High-performance Stream Processing Framework. An alternative to Apache Flink.

Language: Rust | License: Apache-2.0 | Stargazers: 418 | Issues: 15 | Issues: 2

morel

Standard ML interpreter, with relational extensions, implemented in Java

Language: Java | License: Apache-2.0 | Stargazers: 294 | Issues: 7 | Issues: 138

mat-calcite-plugin

Heap query plugin for Eclipse Memory Analyzer

Language: Java | License: Apache-2.0 | Stargazers: 153 | Issues: 11 | Issues: 22