USCDataScience / sparkler

Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.

Home Page: http://irds.usc.edu/sparkler/

Improve deployments for different architectures

buggtb opened this issue

Target users: Joe and his laptop, whether Mac, Windows, or Linux. How do we support these users?

Then, when they want to scale up, how do you deploy SCE alongside an existing Spark cluster, run the crawl in the cluster, and get the output?

How do you make the most of cloud services and deploy to AWS/GCE/Azure?