fabsta / spark_notebooks

Collection of Spark notebooks (Zeppelin)


Spark data science Zeppelin notebooks

# Advanced Analytics with Spark (Book)

| Description | Language | ML-Component | Link | View |
|---|---|---|---|---|
| AAS (Chapter 3): Recommending music (alternating least squares recommender) | Spark | | | |
| AAS (Chapter 4): Predicting forest cover with decision trees | Spark | Decision Tree | | |
| AAS (Chapter 5): Anomaly Detection in Network Traffic with K-means clustering | Spark | K-Means | | |
| AAS (Chapter 6): Understanding Wikipedia with Latent Semantic Analysis | Spark | | | |
| AAS (Chapter 7): Analyzing Co-occurrence Networks with GraphX | Spark | | | |
| AAS (Chapter 8): Geospatial and Temporal Data Analysis on the New York City Taxi Trip Data | Spark | | | |
| AAS (Chapter 9): Estimating Financial Risk through Monte Carlo Simulation | Spark | Monte-Carlo Simulation | | |
| AAS (Chapter 10): Analyzing Genomics Data and the BDG Project | Spark | | | |
| AAS (Chapter 11): Analyzing Neuroimaging Data with PySpark and Thunder | Spark | | | |

# Machine learning

| Description | Language | ML-Component | Link | View | Author |
|---|---|---|---|---|---|
| Kaggle - AirBNB - Data exploration | Python | feature engineering | ipython_notebooks | | https://www.kaggle.com/c/titanic/details/getting-started-with-python |
| Kaggle - AirBNB - Data exploration 2 | Python | feature engineering | ipython_notebooks | | https://www.kaggle.com/dimon009/airbnb-recruiting-new-user-bookings/airbnb-exploratory-analysis |

# Other

| Description | Language | ML-Component | Link | View | Author |
|---|---|---|---|---|---|
| Kaggle - Titanic survival prediction | SparkSQL | | 2BEVBMRCY | | https://www.kaggle.com/c/titanic/details/getting-started-with-python |
| Earthquake visualisation | Spark | | 2BEPUUY8B | | |

