wxl24life

Beijing

Drake Wang's repositories

fast-data-dev

Kafka Docker for development. Kafka, Zookeeper, Schema Registry, Kafka-Connect, Landoop Tools, 20+ connectors

Language: Shell, Apache-2.0, 1 0 0

AthenaX

SQL-based streaming analytics platform at scale

Language: Java, Apache-2.0, 0 1 0

ByConity

ByConity is an open source cloud-native data warehouse

Language: C++, Apache-2.0, 0 0 0

cloudera-playbook

Cloudera deployment automation with Ansible

Language: HTML, Apache-2.0, 0 1 0

delta

An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.

Language: Scala, Apache-2.0, 0 1 0

delta-sharing

An open protocol for secure data sharing

Language: Scala, Apache-2.0, 0 1 0

docs.zh-cn

0 0 0

dolphinscheduler

Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs out of the box.

Language: Java, Apache-2.0, 0 1 0

doris

Apache Doris is an easy-to-use, high performance and unified analytics database.

Language: Java, Apache-2.0, 0 0 0

facebook-hive-udfs

Facebook's Hive UDFs

Language: Java, Apache-2.0, 0 1 0

impala

Apache Impala

Language: C++, Apache-2.0, 0 0 0

starrocks

StarRocks is a next-gen sub-second MPP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics and ad-hoc query.

Language: Java, Apache-2.0, 0 0 0

fe-plugins-auditloader

0 0 0

flink-parquet-demo

A simple demo that writes HDFS files in Parquet format.

Language: Java, 0 2 0

infoworld-post

Code examples for a blog post on infoworld.com

Language: Java, Apache-2.0, 0 2 0

Java2Scala

Some demo code written while playing with Java & Scala

Language: Java, 0 1 0

jdbook_crawler

Crawl jd.com book information

Language: Python, 0 2 0

kafka

Mirror of Apache Kafka

Language: Java, Apache-2.0, 0 1 0

LearningSparkV2

This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]

Language: Scala, Apache-2.0, 0 0 0

medium-blog-kafka-udemy

Supporting repository for the blog post at https://medium.com/@stephane.maarek/how-to-use-apache-kafka-to-transform-a-batch-pipeline-into-a-real-time-one-831b48a6ad85

Language: Java, 0 2 0

nifi

Mirror of Apache NiFi

Language: Java, Apache-2.0, 0 0 0

openai-cookbook

Examples and guides for using the OpenAI API

Language: Jupyter Notebook, MIT, 0 0 0

openbilibili-go-common

🙈! 🙉! 🙊! I have no idea what these are… If you want to talk about ethics, please head out the door and turn right to 996.icu!

Language: Go, 0 0 0

scala-labs-exercises

My standalone version of the scala-labs project

Language: Scala, 0 1 0

spark

Apache Spark - A unified analytics engine for large-scale data processing

Language: Scala, Apache-2.0, 0 0 0

SparkInternals

Notes on the design and implementation of Apache Spark

0 1 0

sql-training

Language: Shell, 0 1 0

starrocks-connector-for-apache-flink

Apache-2.0, 0 0 0

trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Apache-2.0, 0 0 0

wxl24life.github.io

Language: HTML, 0 2 0