acroz / sparkmagic-talk

Slides and a demo notebook from my talk at PyCon UK 2018.

Using Spark from Python and Jupyter

This repo contains slides and a demo notebook from my talk at PyCon UK 2018.

You can watch the talk on YouTube.

Content

In this talk I presented:

  • A brief introduction to Apache Spark.
  • Connecting to a Spark cluster running the Apache Livy REST interface, from Jupyter with sparkmagic and from any Python code with pylivy.
  • The basics of loading data into Spark, manipulating it and doing analysis with MLlib.
  • Retrieving data back into Jupyter or Python for further analysis.
  • An example web app using Plotly Dash, Python RQ and pylivy to build a Spark-powered dashboard using only Python.
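
As a rough illustration of what sparkmagic and pylivy do under the hood, the sketch below builds (but does not send) the two HTTP requests at the core of the Livy REST interface: one to start an interactive session and one to submit code to it. It is not taken from the talk materials. The server URL and the helper names (`build_request`, `create_session_request`, `run_statement_request`) are assumptions made for illustration, and a real client would also need to poll the session and statement state before reading results.

```python
import json
from urllib.request import Request

# Assumption: a Livy server on localhost, listening on Livy's default port.
LIVY_URL = "http://localhost:8998"


def build_request(path, payload):
    """Build (but do not send) a JSON POST request against the Livy REST API."""
    return Request(
        url=f"{LIVY_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_session_request(kind="pyspark"):
    # POST /sessions asks Livy to start a new interactive session.
    return build_request("/sessions", {"kind": kind})


def run_statement_request(session_id, code):
    # POST /sessions/{id}/statements submits a snippet of code to that session.
    return build_request(f"/sessions/{session_id}/statements", {"code": code})


if __name__ == "__main__":
    # Inspect the request that would run a small Spark job in session 0.
    request = run_statement_request(0, "spark.range(10).count()")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing one of these objects to `urllib.request.urlopen` would send it to a running Livy server. In practice, pylivy and sparkmagic wrap this request cycle so you do not have to build the requests by hand.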

Questions and feedback

Questions and feedback are welcome, either as GitHub issues on this repo or by email at wacrozier@gmail.com.

Contribute

pylivy does not yet support all of the features provided by Livy. If you'd like to contribute, please get in touch!
