idealo / terraform-emr-pyspark

Quickstart PySpark with Anaconda on AWS/EMR using Terraform

Home Page: https://medium.com/idealo-tech-blog/using-terraform-to-quick-start-pyspark-on-aws-2bc8ce9dcac

Terraform + EMR Bootstrap PySpark with Anaconda

This code helps you jump-start PySpark with Anaconda on AWS EMR using Terraform.

Getting Started

  1. Install Terraform (on macOS: brew install terraform)
  2. Adjust the scripts (bootstrap_actions.sh and pyspark_quick_setup.sh) in the scripts directory if necessary
  3. Set parameters in terraform.tfvars
  4. Start the cluster:
     terraform init
     terraform apply
  5. Destroy the cluster:
     terraform destroy
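
The start-up steps above can be wrapped in a small helper. This is a minimal sketch and not part of the repository: the function names missing_tools and cluster_up are hypothetical, and the terraform plan step is an addition that lets you review the changes before anything is created.

```shell
# Hypothetical helper functions; the names are illustrative, not from the repository.

# Print the name of every tool passed as an argument that is not available.
# The body runs in a subshell so the loop variable does not leak.
missing_tools() (
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
)

# Check prerequisites, then initialise, preview (added step) and create the cluster.
# The chain stops at the first command that fails.
cluster_up() (
  missing="$(missing_tools terraform aws)"
  if [ -n "$missing" ]; then
    printf 'missing required tools:\n%s\n' "$missing" >&2
    exit 1
  fi
  terraform init && terraform plan && terraform apply
)
```

After sourcing these functions in the repository root, running cluster_up performs the steps in order. terraform apply still asks for confirmation, so nothing is created unattended.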

Notes

  • Configure your AWS credentials and default region on your local machine before starting the cluster: aws configure
  • AWS instance cost for eu-central-1
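
Before starting a cluster, it can be useful to confirm that aws configure actually took effect. The following is a minimal sketch using two standard AWS CLI calls; the function names aws_ready and configured_region are hypothetical and not part of the repository.

```shell
# Hypothetical helper functions; the names are illustrative, not from the repository.

# Succeed only if the AWS CLI can authenticate with the current credentials.
aws_ready() {
  aws sts get-caller-identity >/dev/null 2>&1
}

# Print the region stored in the default profile (prints nothing if unset).
configured_region() {
  aws configure get region 2>/dev/null
}
```

For example, aws_ready && configured_region prints the default region only when the credentials are valid, which helps confirm that the cluster will be billed in the region you expect.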

Maintainers

Copyright

See LICENSE for details.

License: Apache License 2.0


Languages

HCL 92.8%, Shell 7.2%