Huaxin Gao's repositories
arrow-datafusion
Apache Arrow DataFusion SQL Query Engine
Language: Rust, License: Apache-2.0
arrow-datafusion-comet
Apache Arrow DataFusion Comet Spark Accelerator
Language: Rust, License: Apache-2.0
delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
Language: Scala, License: Apache-2.0
hyperspace
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
jvm-profiler
JVM Profiler Sending Metrics to Kafka, Console Output or Custom Reporter
orc
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
Language: HTML, License: Apache-2.0
parquet-format
Apache Parquet
parquet-mr
Apache Parquet
scikit-learn
scikit-learn: machine learning in Python
Language: Python, License: NOASSERTION
presto
Distributed SQL query engine for big data
Language: Java, License: Apache-2.0
spark
Mirror of Apache Spark
Language: Scala, License: Apache-2.0
spark-examples
Official Spark examples adapted for sbt
License: Apache-2.0
spark-redshift
Spark and Redshift integration
spark-website
Apache Spark Website