abhi117a / Python

Practicing Apache Spark with Python, starting from a word count program and ending with a movie recommender system on Amazon EC2 clusters.


Python, Apache Spark, Amazon EC2.

Practicing Apache Spark with Python, starting from a word count program and ending with a movie recommender system on Amazon EC2 clusters.

Data source: https://grouplens.org/datasets/movielens/

On a local system, use the smaller "MovieLens 100K Dataset". Once you start working on clusters, you can move up to the "MovieLens 1M Dataset" or the "10M" dataset.

You will need Python 2.x, a Python editor (recommended: PyCharm, Canopy, or Jupyter Notebook), the latest version of Apache Spark, and, optionally, access to the Amazon AWS cloud service.
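As a rough idea of the word count starting point, here is a minimal sketch. It is not the code from this repository. The function names (`tokenize`, `count_words_local`, `count_words_spark`, `main`), the app name `WordCount`, and the `local[*]` master are all assumptions made for illustration. The pure-Python helper is only there so the result can be checked without a Spark installation.

```python
from __future__ import print_function  # keeps print() usable on Python 2.x

import re
from collections import Counter


def tokenize(line):
    """Split a line into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", line.lower())


def count_words_local(lines):
    """Pure-Python reference for small inputs (no Spark needed)."""
    counts = Counter()
    for line in lines:
        counts.update(tokenize(line))
    return dict(counts)


def count_words_spark(sc, path):
    """Word count on an existing SparkContext `sc`; returns a dict."""
    pairs = (
        sc.textFile(path)                  # one record per line of the file
        .flatMap(tokenize)                 # one record per word
        .map(lambda word: (word, 1))       # (word, 1) pairs
        .reduceByKey(lambda a, b: a + b)   # sum the 1s per word
    )
    return dict(pairs.collect())


def main(path):
    """Run the Spark job locally and print the 20 most frequent words."""
    # Imported here so the helpers above work without pyspark installed.
    from pyspark import SparkConf, SparkContext

    # "local[*]" and "WordCount" are illustrative choices, not project settings.
    conf = SparkConf().setMaster("local[*]").setAppName("WordCount")
    sc = SparkContext(conf=conf)
    try:
        counts = count_words_spark(sc, path)
        top = sorted(counts.items(), key=lambda kv: -kv[1])[:20]
        for word, n in top:
            print(word, n)
    finally:
        sc.stop()
```

To try it locally, call `main("some_text_file.txt")` with any plain-text file (the file name here is a placeholder). On a cluster, the master would normally be supplied by the launcher rather than hard-coded in `SparkConf`.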



Languages

Jupyter Notebook 51.0%, Python 49.0%