Qian Xie's repositories
databricks-training-data-science-spark
TRAINING: DATA SCIENCE WITH APACHE SPARK 2.X
databricks-training-deeplearning-spark
Deep Learning, Keras, TensorFlow & Spark Training by Databricks
databricks-training-spark-tuning
TRAINING: APACHE SPARK TUNING AND BEST PRACTICES
drunken-data-quality
Spark package for checking data quality
git-flight-rules
Flight rules for git
imbalanced-learn
Python module to perform under sampling and over sampling with various techniques.
mxnet-the-straight-dope
An interactive book on deep learning. Much easy, so MXNet. Wow.
pandas-profiling
Create HTML profiling reports from pandas DataFrame objects
PySpark-Boilerplate
A boilerplate for writing PySpark Jobs
pyspark-example-project
Example project and best practices for Python-based Spark ETL jobs and applications.
pyspark-jupyter-cdh
PySpark Jupyter Notebook on Cloudera CDH
pyspark-pictures
Learn the pyspark API through pictures and simple examples
pysparkling
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
pytest-spark
pytest plugin to run the tests with support of pyspark
python-project-template
A template Python project with a focus on best practices.
scala_school
Lessons in the Fundamentals of Scala
scalable-data-science
Course in scalable data science using Apache Spark over Databricks.
spark-config-and-tuning
Summary of Spark performance tuning: Spark config and tuning
spark2-etl-examples
A project with examples of using a few commonly used data manipulation/processing/transformation APIs in Apache Spark 2.0.0
spark_training
Sample Spark Code
xmas-tweets
An Apache Spark case study: Gathering Tweets about Christmas with Apache Spark Streaming. Sentiment Analysis with Spark Core NLP.