tspannhw / scala-for-data-science

Materials for Scala Days and Strata talks "Scala: The Unpredicted Lingua Franca for Data Science"

Home Page: http://event.scaladays.org/scaladays-nyc-2016#!#schedulePopupExtras-7539


Scala for Data Science

These notebooks and other material are for the Scala Days 2016 and Strata London 2016 talks by Andy Petrella and Dean Wampler on why Scala is a great language for Data Science.

The talk is organized as a series of notebooks. To run them, install Spark Notebook, then start it with the following commands. We assume that $SPARK_NOTEBOOK_HOME is where you installed it and that you are running the commands from this directory, $PWD (the notebooks argument requires a full path):

export NOTEBOOKS_DIR=$PWD/notebooks
$SPARK_NOTEBOOK_HOME/bin/spark-notebook
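
If the launch fails, the usual causes are an unset $SPARK_NOTEBOOK_HOME or running from the wrong directory. The following is a minimal sketch of a pre-flight check for Unix-like shells. The function name `check_setup` is hypothetical and not part of Spark Notebook; the only assumptions it makes are the ones stated above (a `bin/spark-notebook` launcher under the install directory and a `notebooks` directory in this repository).

```shell
# Hypothetical helper (not part of Spark Notebook): verifies the two
# assumptions from the instructions above before launching.
#   $1: the Spark Notebook install directory ($SPARK_NOTEBOOK_HOME)
#   $2: the full path to the notebooks directory ($PWD/notebooks)
check_setup() {
  home="$1"
  notebooks="$2"
  if [ -z "$home" ]; then
    echo "SPARK_NOTEBOOK_HOME is not set"
    return 1
  fi
  if [ ! -x "$home/bin/spark-notebook" ]; then
    echo "launcher not found: $home/bin/spark-notebook"
    return 1
  fi
  if [ ! -d "$notebooks" ]; then
    echo "notebooks directory not found: $notebooks"
    return 1
  fi
  echo "ok"
}
```

One way to use it is to run `check_setup "$SPARK_NOTEBOOK_HOME" "$PWD/notebooks"` from this directory and proceed with the two commands above only if it prints `ok`.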

For Windows, use the following:

set NOTEBOOKS_DIR=%CD%\notebooks
%SPARK_NOTEBOOK_HOME%\bin\spark-notebook

Then open the notebooks (e.g., WhyScala). To evaluate all the cells in a notebook, use the Cell > Run All menu item.

Grab the slides here.
