jcrist / dask-tutorial-pydata-seattle-2017

Dask Tutorial for PyData Seattle 2017

This contains materials for the dask tutorial Parallelizing Scientific Python with Dask.

Setup

This tutorial is designed to run in an online environment. You shouldn't need to install anything; just go to the following link and click the big blue button:

https://pycon-parallel.jovyan.org

Running Locally

If you want to run this tutorial locally instead, you can use the following instructions. Note that the material in notebooks 3 & 4 will not fully work without the environment set up at the link above. Everything else should work fine.

To set up locally, clone the repo and install all required dependencies:

$ git clone https://github.com/jcrist/dask-tutorial-pydata-seattle-2017
$ conda install dask distributed matplotlib s3fs jupyter -c conda-forge
$ pip install graphviz

Then start a Jupyter notebook server inside the cloned directory:

$ cd dask-tutorial-pydata-seattle-2017
$ jupyter notebook
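
Before opening the notebooks, it can help to confirm that the packages from the install commands above are importable. The following is a minimal sketch, not part of the tutorial materials: the helper name `missing_packages` is hypothetical, and it assumes each package's import name matches its install name (jupyter is left out because it is launched from the command line rather than imported).

```python
from importlib.util import find_spec

# Packages named in the conda/pip install commands above.
# Assumption: each import name is the same as its install name.
REQUIRED = ["dask", "distributed", "matplotlib", "s3fs", "graphviz"]


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If anything is reported missing, re-running the install commands inside the active conda environment is the likely fix.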

Acknowledgements

This tutorial wouldn't be possible without the work done by the larger Dask community in implementing much of the functionality found here. The materials here are based on a previous Dask tutorial by Matt Rocklin.

We also thank Google for generously providing compute credits on Google Compute Engine, which backed the distributed clusters used during the tutorial at PyData Seattle.
