Nan Zhu's repositories
xgboost4j-spark-scalability
A benchmark to test the scalability of xgboost4j-spark and related projects
Self-Learning-Notebooks
RLLearning
analytics-zoo
Distributed TensorFlow, Keras and BigDL on Apache Spark
arrow-datafusion
Apache Arrow DataFusion and Ballista query engines
celeborn-website
Apache Celeborn Site
cockroachdb-todo-apps
CockroachDB To-Do Apps
cockroachdb_playground
Some programs to play around with CockroachDB
ec2-selector-cli
A CLI tool to select EC2 instances based on filters
frameless
Expressive types for Spark.
gazelle_plugin
Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.
github-markdown-toc
Easy TOC creation for GitHub README.md
how-query-engines-work
This is the companion repository for the book How Query Engines Work.
incubator-celeborn
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
incubator-sedona
A cluster computing framework for processing large-scale geospatial data
incubator-uniffle
Uniffle is a high-performance, general-purpose Remote Shuffle Service.
spark-lineage
Spark SQL listener to record lineage information
spark-sql-macros
Spark SQL Macros provides a mechanism similar to Spark User-Defined Function registration, with the key enhancement that custom code gets compiled to equivalent Catalyst Expressions at macro definition time.
terraform-aws-eks-node-group
Terraform module to provision a fully managed AWS EKS Node Group
velox-intel
A new C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.