gregoryg / ds-for-telco

ds-for-telco

Created by Juliet Houghland + Sandy Ryza (juliet@cloudera.com)
The source notebook demonstrates building a churn prediction model with Spark, using Spark MLlib's Pipeline API for cross-validation and model tuning. The Pipeline API is available in PySpark 1.6 or higher.

Status: Demo Ready
Use Case: Telco Churn Prediction

Steps:

  1. Open a terminal and run setup.sh
  2. Create a Python session and run setup.py
  3. In your Python session, run ds-for-telco.py
  4. When finished, run cleanup.sh in the terminal

Recommended Session Sizes: 2 CPU, 4 GB RAM

Estimated Runtime:
ds-for-telco.py --> approx 1 min

Recommended Jobs/Pipeline:
None

Demo Script
TBD

Related Content:
http://blog.cloudera.com/blog/2016/02/how-to-predict-telco-churn-with-apache-spark-mllib/

License: Apache License 2.0
