Savadogo / Sparkify-project

Build a machine learning model on IBM Watson with Spark

Predicting churn with Spark

This project was completed as the capstone of my Udacity Data Scientist Nanodegree. The task was to build a machine learning model that predicts user churn for a music streaming platform (Sparkify).

Table of Contents

  1. Package requirements
  2. File Descriptions
  3. Results
  4. Licensing, Authors, and Acknowledgements

Package requirements

  • Python 3.*
  • pyspark
  • matplotlib
  • numpy
  • pandas
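
To confirm that the packages listed above are installed before opening the notebooks, a minimal sketch along these lines can help. The helper name `missing_packages` is hypothetical and not part of this repository; it relies only on the standard library's `importlib.util.find_spec`.

```python
import sys
from importlib.util import find_spec

# Package names copied from the requirements list above.
REQUIRED_PACKAGES = ["pyspark", "matplotlib", "numpy", "pandas"]


def missing_packages(names):
    """Return the package names that cannot be found in this environment.

    Hypothetical helper for illustration; it only checks that each
    top-level package can be located, not which version is installed.
    """
    return [name for name in names if find_spec(name) is None]


# The requirements list asks for Python 3.
if sys.version_info[0] < 3:
    print("Python 3 is required")

missing = missing_packages(REQUIRED_PACKAGES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found")
```

Any package reported as missing can then be installed with the package manager of your choice before running the notebooks.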

File Descriptions

The following files are in this repository:

  • Sparkify - worspace.ipynb - a notebook with the analysis of the small dataset (128 MB). The notebook is also exported as Sparkify - worspace.html.

  • Sparkify - ibm watson.ipynb - a notebook with the analysis of the medium dataset (231 MB), run on IBM Watson. The notebook is also exported as Sparkify - ibm watson.html.

Results

A detailed blog post on the findings is available here.

Licensing, Authors, Acknowledgements

The code in this project is licensed under the MIT license.
