Min Zhao (zhaomin1423)

Location: Hangzhou, China

Organizations
apache

Min Zhao's repositories

incubator-iceberg

Apache Iceberg (Incubating)

Language: Java | License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0

airbyte

Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.

Language: Java | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

arctic

Arctic is a streaming lake warehouse service open sourced by NetEase

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

flink

Apache Flink

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

flink-cdc-connectors

Change Data Capture (CDC) Connectors for Apache Flink

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

bitsail

BitSail is a distributed, high-performance data integration engine that provides global data integration solutions in batch, streaming, and incremental scenarios. At present, BitSail is widely used and synchronizes hundreds of trillions of data every day.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

blaze

A blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

DataSphereStudio

DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

debezium

Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs for Scala, Java, Rust, Ruby, and Python.

Language: Scala | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

dolphinscheduler

Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs out of the box.

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

elasticsearch-hadoop

Elasticsearch real-time search and analytics natively integrated with Hadoop

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Scala | Stargazers: 0 | Issues: 0 | Issues: 0

gravitino

A high-performance, geo-distributed and federated metadata lake

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

incubator-doris

Apache Doris (Incubating)

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

incubator-kyuubi-website

Apache Kyuubi Site

Language: HTML | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

incubator-linkis

Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

incubator-livy

Mirror of Apache Livy (Incubating)

Language: Scala | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

incubator-paimon

Apache Paimon (incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking and efficient real-time analytics.

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

incubator-seatunnel

SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

kyuubi

Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark

Language: Scala | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

kyuubi-client

Client libraries for end users of Apache Kyuubi

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

OpenLineage

An Open Standard for lineage metadata collection

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

pulsar

Apache Pulsar - distributed pub-sub messaging system

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

spark

Apache Spark - A unified analytics engine for large-scale data processing

Language: Scala | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

spark-clickhouse-connector

Spark ClickHouse Connector built on the DataSourceV2 API and gRPC protocol.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

spark-distcp

A re-implementation of Hadoop DistCP in Apache Spark

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

spark-sql-dsv2-extension

A SQL extension built on the Spark 3 DataSource V2 API, e.g. Hive V2 catalog support among multiple clusters

Language: Scala | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0