Kyle Bendickson's repositories
spark-testing-base
Base classes to use when writing tests with Spark
airflow
Apache Airflow
airflow-provider-sample
A template repo for building and releasing Airflow provider packages.
astronomer-providers
Airflow Providers containing Deferrable Operators & Sensors from Astronomer
aws-mwaa-local-runner
This repository provides a command line interface (CLI) utility that replicates an Amazon Managed Workflows for Apache Airflow (MWAA) environment locally.
bigtop
Mirror of Apache Bigtop
datahub
The Metadata Platform for the Modern Data Stack
dockerhub-description
A GitHub action to update a Docker Hub repository description from README.md
excalidraw
Virtual whiteboard for sketching hand-drawn-like diagrams
flink-cdc-connectors
CDC Connectors for Apache Flink®
flink-faker
A data generator source connector for Flink SQL based on java-faker.
flink-playgrounds
Apache Flink Playgrounds
flink-table-store
An Apache Flink subproject to provide storage for dynamic tables.
flink-training
Apache Flink Training Exercises
gradle-revapi
Gradle plugin that uses Revapi to check whether you have introduced API/ABI breaks in your Java public API
iceberg-docs
Apache Iceberg Documentation
ngods
New generation open-source data stack
ray
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
raydp
RayDP: Distributed data processing library that provides simple APIs for running Spark on Ray and integrating Spark with distributed deep learning and machine learning frameworks.
smart_open
Utils for streaming large files (S3, HDFS, gzip, bz2...)
vscode-remote-try-java
Java sample project for trying out the VS Code Remote - Containers extension
zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
zetasql
ZetaSQL - Analyzer Framework for SQL