datitran / emr-bootstrap-pyspark

Quickstart PySpark with Anaconda on AWS/EMR

Home Page: https://medium.com/@datitran/quickstart-pyspark-with-anaconda-on-aws-660252b88c9a?source=user_profile---------15----------------

EMR Bootstrap PySpark with Anaconda

This code is meant to help you jump-start PySpark with Anaconda on AWS EMR.

Getting Started

  1. Create the conda environment: conda env create -f environment.yml
  2. Fill in all the required information (e.g. AWS access key, secret access key) in the config.yml.example file and rename it to config.yml.
  3. Run the loader: python emr_loader.py
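
Before running step 3, it can help to confirm that config.yml no longer contains empty placeholders. The sketch below is not part of the repository. The key names `aws_access_key` and `aws_secret_access_key` are hypothetical stand-ins for whatever fields config.yml.example actually defines, and `load_config` assumes the PyYAML package is available in the conda environment.

```python
from pathlib import Path

# Hypothetical field names: replace them with the keys that
# config.yml.example actually defines.
REQUIRED_KEYS = ("aws_access_key", "aws_secret_access_key")


def missing_keys(config, required=REQUIRED_KEYS):
    """Return the required keys that are absent or left empty in `config`."""
    return [key for key in required if not config.get(key)]


def load_config(path="config.yml"):
    """Parse the YAML config file into a dict (assumes PyYAML is installed)."""
    import yaml  # imported lazily so missing_keys() works without PyYAML

    with Path(path).open() as handle:
        # An empty file parses to None, so fall back to an empty dict.
        return yaml.safe_load(handle) or {}
```

Calling `missing_keys(load_config())` returns an empty list when every listed field has a value, and otherwise names the fields that still need to be filled in.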

Requirements

Copyright

See LICENSE for details. Copyright (c) 2016 Dat Tran.

About

License: MIT License


Languages

Python 92.5%, Shell 7.5%