Fu Chen's repositories
spark-rest-source
A REST API Structured Streaming DataSource
kungfu-panda
Kungfu Panda is a library for registering Python pandas UDFs in Spark SQL.
ammonite-spark
Run Spark calculations from Ammonite
calcite
Mirror of Apache Calcite
canal
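Alibaba's incremental subscription and consumption component for MySQL binlog. Alibaba Cloud DRDS ( https://www.aliyun.com/product/drds ), Alibaba TDDL secondary indexes, and small-table replication are powered by canal. Aliyun Data Lake Analytics ( https://www.aliyun.com/product/datalakeanalytics ) is powered by canal.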
d2l-zh
Dive into Deep Learning (Chinese edition)
davinci
Davinci is a DVaaS (Data Visualization as a Service) Platform
gluten
Gluten: Plugin to Double SparkSQL's Performance
incubator-celeborn
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
incubator-celeborn-website
Apache Celeborn Site
incubator-hudi
Upserts And Incremental Processing on Big Data
incubator-kyuubi
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
incubator-kyuubi-website
Apache Kyuubi Site
jvm-profiler
JVM Profiler Sending Metrics to Kafka, Console Output or Custom Reporter
koalas
Koalas: Pandas API on Apache Spark
mlflow
Open source platform for the machine learning lifecycle
raydp
RayDP: Distributed data processing library that provides simple APIs for running Spark on Ray and integrating Spark with distributed deep learning and machine learning frameworks.
spark
Mirror of Apache Spark
streamingpro
Build Spark Streaming applications with SQL
velox
A new C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.