david.zyw's repositories
aws-sdk-java
The official AWS SDK for Java.
banzai-charts
Curated list of Banzai Cloud Helm charts used by the Pipeline Platform
bigdata-platform-on-k8s
Deploy a big data platform on Kubernetes
cube-studio
Cloud-native one-stop machine learning platform: multi-user, Dataleap, Notebook, drag-and-drop pipeline, multi-machine multi-GPU distributed training, AutoML, inference, edge computing, federation schedule, real-time training, large models, AIhub
datahub
The Metadata Platform for the Modern Data Stack
dinky
Dinky is an out-of-the-box, one-stop real-time computing platform dedicated to the construction and practice of Unified Streaming & Batch and Unified Data Lake & Data Warehouse. Based on Apache Flink, Dinky provides the ability to connect many big data frameworks including OLAP and Data Lake.
docusaurus
Easy to maintain open source documentation websites.
flink
Apache Flink
flink-connector-elasticsearch
Apache Flink connector for ElasticSearch
flink-docker
Docker packaging for Apache Flink
flink-kubernetes-operator
Apache Flink Kubernetes Operator
hadoop
Apache Hadoop
iperf-jperf
Improvements to jperf, a Java interface to the iperf network throughput testing suite
presto-on-k8s
Deploying Presto on K8S as a cloud OLAP Service, dynamic scaling based on HPA
spark
Apache Spark - A unified analytics engine for large-scale data processing
spark-on-k8s-operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
superset-2.1.0rc1
Apache Superset is a Data Visualization and Data Exploration Platform
talk-demos
Code & docs for Pipekit's talks
transporter
Sync data between persistence engines, like ETL only not stodgy