[WW]Connect 2016 workshop: Apache Spark For Everyone
----
Spark downloads: http://spark.apache.org/downloads.html
This section contains steps to install Apache Zeppelin, RStudio, Databricks Community Edition, and Jupyter.
Apache Zeppelin:
- Interactive web-based notebook platform for data, currently being incubated by Apache.
- Multiple language backends, including flavors of Spark, which means we don't have to install separate kernels, modules, plugins, or libraries to use it! Supports Scala (with Apache Spark), Python (with Apache Spark), SparkSQL, Hive, Markdown, and Shell.
- Learn more at https://zeppelin.incubator.apache.org/
Installing Zeppelin at the command line:
git clone https://github.com/apache/incubator-zeppelin
cd incubator-zeppelin
export MAVEN_OPTS="-Xmx512m -XX:MaxPermSize=256m"
mvn install -DskipTests -Dspark.version=1.6.0 -Dhadoop.version=2.4.0
bin/zeppelin-daemon.sh start
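Once the daemon reports it has started, the notebook UI is served on port 8080 by default. A quick reachability check from the same machine can look like this (a sketch assuming the default port; it prints a message either way):

```shell
# Probe the default Zeppelin port; -s silences progress output, -f fails on HTTP errors.
curl -sf http://localhost:8080/ >/dev/null \
  && echo "Zeppelin is up" \
  || echo "Zeppelin is not reachable"
```

If Zeppelin is not reachable, check the daemon log under the logs/ directory of your Zeppelin checkout.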
For now, if you want access to the spark-csv package, you need to run a dependency cell as the very first cell of the notebook:
%dep
z.reset()
z.addRepo("Spark Packages Repo").url("http://dl.bintray.com/spark-packages/maven")
z.load("com.databricks:spark-csv_2.11:1.4.0")
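Once that %dep cell has run, a later notebook cell can use the package through the sqlContext Zeppelin provides. A minimal sketch (the file path and the header/inferSchema options are assumptions for illustration; this only runs inside a Zeppelin Spark cell, not standalone):

```scala
// Read a CSV into a DataFrame via spark-csv; "data/people.csv" is a hypothetical path.
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")       // treat the first line as column names
  .option("inferSchema", "true")  // guess column types instead of all-strings
  .load("data/people.csv")
df.printSchema()
```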
Jupyter:
While Jupyter runs code in many different programming languages, Python is a prerequisite for installing Jupyter Notebook.
Downloads and install docs:
https://www.continuum.io/downloads
http://jupyter.readthedocs.org/en/latest/install.html
Install with conda (Anaconda) or with pip, whichever you use:
conda install jupyter
pip3 install jupyter
pip install jupyter
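Before picking one of the install commands above, it can help to check which Python tooling is actually on your PATH. A small sketch (the Anaconda download linked above bundles both conda and pip):

```shell
# Report which package manager would be used for the Jupyter install, if any.
if command -v conda >/dev/null 2>&1; then
  echo "conda found: use 'conda install jupyter'"
elif command -v pip3 >/dev/null 2>&1; then
  echo "pip3 found: use 'pip3 install jupyter'"
elif command -v pip >/dev/null 2>&1; then
  echo "pip found: use 'pip install jupyter'"
else
  echo "no conda or pip found: install Anaconda from the link above"
fi
```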
SparkR and RStudio:
Spark includes SparkR as of version 1.4, along with a REPL called sparkR. It can also be used from interactive environments such as RStudio.
For Mac: get the package at https://cran.rstudio.com/bin/macosx/ or use a package manager like Homebrew.
For Windows: ?
RStudio download: https://www.rstudio.com/products/rstudio/download/
Databricks Community Edition: a free version of the Databricks Spark platform for learning. There is a waiting list for accounts. No local installation needed! Awesome!
Platforms:
- Zeppelin http://zeppelin.incubator.apache.org/
- RStudio http://www.rstudio.com/products/rstudio/download/
- Jupyter http://jupyter.org/
- Databricks Community Edition: https://databricks.com/blog/2016/02/17/introducing-databricks-community-edition-apache-spark-for-all.html
Further Spark resources:
- Spark Programming Guide: http://spark.apache.org/docs/latest/programming-guide.html
- Get on the mailing list and find out about conferences and events: http://spark.apache.org/community.html