QY's repositories
alluxio
Alluxio, formerly Tachyon, Unify Data at Memory Speed
deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
griffin
Mirror of Apache griffin
grpc
The C-based gRPC (C++, Node.js, Python, Ruby, Objective-C, PHP, C#)
hue
Hue is an open source Workbench for developing and accessing SQL/Data Apps.
jupyterlab
JupyterLab computational environment.
jupyterlab-git
A Git extension for JupyterLab
NJU-DisSys-2017
Distributed System, Fall 2017, CS@NJU
Personae
Personae is a repo of implementations and an environment for Deep Reinforcement Learning & Supervised Learning.
pyjnius
Access Java classes from Python
PyPandas
PyPandas, a data cleaning framework for Spark
smart_open
Utils for streaming large files (S3, HDFS, gzip, bz2...)
SmartFD
SmartFD: Efficient and Scalable Functional Dependency Discovery on Distributed Data-Parallel Platforms
spark-iforest
Isolation Forest on Spark
spark-lof
A parallel implementation of local outlier factor based on Spark
SparkInternals
Notes talking about the design and implementation of Apache Spark
sparklingpandas
Sparkling Pandas
WebSiteUseful
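🍅 Bypass the firewall! Circumvent internet censorship: free SS account sharing, SSR subscription sources, and free VPN downloads. For how to obtain and use them, see: https://github.com/loremwalker/fq-book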