ezhaar / jupyter-sparkmagic

dockerized jupyter with sparkmagic support to run against HDInsight clusters

Docker image to run JupyterLab

This Docker image is based on Alpine Linux and runs Jupyter with sparkmagic preinstalled, so that notebooks can be run against a remote HDInsight cluster.

Create a directory on your local machine to store your notebooks, for example:

$ mkdir /home/<username>/notebooks

Configuration

If you plan to use Jupyter against an HDInsight cluster, update the username, password and cluster URL in sparkmagic/config.json.

  • Encode the password with base64:
$ echo -n 'secret-hdinsight-password' | base64
  • Copy the base64-encoded password into the base64_password field in sparkmagic/config.json for each kernel you need.
  • Set the url field in sparkmagic/config.json to the URL of the HDInsight cluster.
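
The encoding step above can be wrapped in a small helper so the value can be checked before it is pasted into config.json. This is a minimal sketch, not part of the repository: the function names encode_password and decode_password are made up for illustration, and it assumes a base64 command that accepts -d for decoding (GNU coreutils and BusyBox do; older macOS releases use -D).

```shell
#!/bin/sh
# encode_password (hypothetical helper): print the base64 encoding of $1.
# printf '%s' writes the value without a trailing newline and does not
# depend on `echo -n`; tr removes the line wrapping that some base64
# implementations apply to long input.
encode_password() {
    printf '%s' "$1" | base64 | tr -d '\n'
}

# decode_password (hypothetical helper): reverse the encoding so the value
# can be verified before it goes into config.json.
# Assumption: -d is the decode flag (older macOS releases use -D).
decode_password() {
    printf '%s' "$1" | base64 -d
}

encoded=$(encode_password 'secret-hdinsight-password')
echo "$encoded"
```

Base64 is an encoding, not encryption: anyone who can read config.json can recover the password, so keep the file out of version control and restrict its permissions.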

Usage

Build the image and run the container:

$ docker build . -t jupyter-sparkmagic
$ docker run -d \
    -p 9999:9999 \
    --mount type=bind,source=/home/<username>/notebooks,target=/root/notebooks \
    --mount type=bind,source=$(pwd)/.sparkmagic,target=/root/.sparkmagic \
    jupyter-sparkmagic

Access Jupyter by opening localhost:9999 in a browser and entering supersecret as the password.
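
Because the container starts in detached mode, Jupyter may take a few seconds to answer on port 9999. The following is a hedged sketch of a readiness check, not something shipped with the image: wait_for and jupyter_is_up are hypothetical helper names, and the check assumes curl is installed on the host and that any HTTP response below 400 means the server is up.

```shell
#!/bin/sh
# wait_for (hypothetical helper): run a command up to N times.
# Usage: wait_for <attempts> <delay-seconds> <command> [args...]
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt will follow
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# jupyter_is_up (hypothetical helper): succeeds once something answers HTTP
# on the published port (default 9999, taken from the docker run command).
jupyter_is_up() {
    curl --silent --fail --location --output /dev/null \
        "http://localhost:${1:-9999}"
}

# Example (not executed here):
#   wait_for 30 1 jupyter_is_up && echo "Jupyter is ready"
```

If the check never succeeds, `docker ps` shows whether the container is still running and `docker logs <container-id>` shows the Jupyter startup output.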

Add Python Packages

To add Python packages to the container, list them in requirements.txt, then rebuild the image and start a new container.
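
As an illustration of that workflow, the sketch below appends a package to requirements.txt only if it is not already listed. The helper name add_requirement is an assumption made for this example, and pandas in the usage note is just a placeholder package.

```shell
#!/bin/sh
# add_requirement (hypothetical helper): append a package spec to a
# requirements file unless the exact same line is already present.
# Usage: add_requirement <requirements-file> <package-spec>
add_requirement() {
    file=$1
    spec=$2
    touch "$file"
    # -x matches whole lines only, -F treats the spec as a literal string
    if grep -qxF -- "$spec" "$file"; then
        return 0
    fi
    # If the file is non-empty and lacks a final newline, add one first so
    # the new spec does not get glued onto the last existing line.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
    printf '%s\n' "$spec" >> "$file"
}
```

For example, `add_requirement requirements.txt pandas` followed by the `docker build` and `docker run` commands from the Usage section produces a new container with the package installed. A container that is already running does not pick up the change.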

ToDo:

  • pre-commit hook to strip output from the notebooks/nbdime
