This project was completed as the capstone of my path to becoming a data scientist with the Udacity nanodegree. The task was to build a machine learning model to predict user churn for a music streaming platform (Sparkify).
The project requires the following:
- Python 3.*
- pyspark
- matplotlib
- numpy
- pandas
The following files are in this repository:
- Sparkify - worspace.ipynb: a notebook of the analysis on the small dataset of 128 MB. This notebook is exported to Sparkify - worspace.html.
- Sparkify - ibm watson.ipynb: a notebook of the analysis on the medium dataset of 231 MB, run on IBM Watson. This notebook is exported to Sparkify - ibm watson.html.
A detailed blog post on the findings is available here.
The code in this project is licensed under the MIT license.