adityakommu / Pyspark

Pyspark-on-AWS-EMR

An implementation of PySpark on AWS EMR.

  1. Upload the requirements script to an S3 bucket

  2. Create a key pair in EC2

  3. Create a cluster on EMR

  4. Create a notebook on EMR and link it to the cluster from step 3

  5. Run the PySpark code in the notebook, using AWS sample data in S3
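
The steps above are normally carried out in the AWS console, but steps 1 to 3 and step 5 can also be sketched in Python. The following is a minimal sketch, not the code from this repository. It assumes the uploaded script is used as an EMR bootstrap action, that the default EMR roles (`EMR_DefaultRole`, `EMR_EC2_DefaultRole`) already exist in the account, and that the sample data is in CSV format. All function names, the release label, the instance type, and the application list are illustrative assumptions; check them against the EMR documentation before use. Step 4 (creating the notebook) is left to the console.

```python
from typing import Any, Dict


def build_cluster_config(
    name: str,
    bucket: str,
    script_key: str,
    key_pair_name: str,
    release_label: str = "emr-6.15.0",  # assumption: any Spark-capable release
    instance_type: str = "m5.xlarge",   # assumption: pick a type that fits the workload
    instance_count: int = 3,            # one primary node plus (instance_count - 1) core nodes
) -> Dict[str, Any]:
    """Build keyword arguments for the boto3 EMR client's run_job_flow call."""
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        # Assumption: Livy and JupyterEnterpriseGateway are included so that
        # a notebook can be attached to the cluster (step 4).
        "Applications": [
            {"Name": "Spark"},
            {"Name": "Livy"},
            {"Name": "JupyterEnterpriseGateway"},
        ],
        "LogUri": f"s3://{bucket}/logs/",
        "Instances": {
            "InstanceGroups": [
                {"Name": "Primary", "InstanceRole": "MASTER",
                 "InstanceType": instance_type, "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE",
                 "InstanceType": instance_type, "InstanceCount": instance_count - 1},
            ],
            "Ec2KeyName": key_pair_name,          # key pair from step 2
            "KeepJobFlowAliveWhenNoSteps": True,  # keep the cluster up for the notebook
        },
        # Assumption: the script uploaded in step 1 runs as a bootstrap action.
        "BootstrapActions": [
            {"Name": "install-requirements",
             "ScriptBootstrapAction": {"Path": f"s3://{bucket}/{script_key}"}},
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def provision(local_script: str, bucket: str, script_key: str,
              key_pair_name: str, cluster_name: str) -> str:
    """Steps 1 to 3: upload the script, create a key pair, start the cluster."""
    import boto3  # imported lazily so build_cluster_config works without boto3

    # Step 1: upload the script to S3.
    boto3.client("s3").upload_file(local_script, bucket, script_key)

    # Step 2: create the key pair and save the private key locally.
    key = boto3.client("ec2").create_key_pair(KeyName=key_pair_name)
    with open(f"{key_pair_name}.pem", "w") as handle:
        handle.write(key["KeyMaterial"])

    # Step 3: start the cluster and return its ID.
    config = build_cluster_config(cluster_name, bucket, script_key, key_pair_name)
    return boto3.client("emr").run_job_flow(**config)["JobFlowId"]


def preview_dataset(spark, path: str) -> int:
    """Step 5: read a CSV dataset from S3 and return its row count.

    `spark` is the SparkSession that the notebook's PySpark kernel provides.
    """
    df = spark.read.csv(path, header=True, inferSchema=True)
    df.printSchema()
    return df.count()
```

Inside the notebook, `preview_dataset(spark, path)` would be called with the S3 path of whichever sample dataset is being used. Keeping the configuration in a separate pure function makes it possible to inspect the request before any AWS resources are created.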

Languages

Jupyter Notebook 99.4%, Shell 0.6%