oscar6echo / jupyter-on-google-cloud

How to set up a remote Jupyter on Google Cloud

A - Overview

Why do that instead of running a local Jupyter, which is even simpler to install and run?

  • Because you can access a powerful machine, with big CPU/GPU, RAM, disk space, etc.
  • Typical use case: Machine learning.

Any difference?

  • You need internet access.
  • The cloud machine is Linux.

How long does it take to set up?

  • Less than 5 min the first time; the longest part is required only once.
  • The longest part is the Anaconda installation itself.
  • Less than 1 min if you install Miniconda and have all the bash scripts ready.
  • Less than 30 s from the second time onward if you created a disk snapshot after the first installation.

How much does it cost?

  • It depends on the type of machine (VM) you use, and for how long you run it.
  • See Google Cloud Compute Engine pricing.
  • IMPORTANT: Do not forget to stop your VM after you're done to avoid paying for nothing.

B - Step by step guide

0 - Prerequisite

You need:

  • Internet access
  • a Google identity - you have one with a gmail account for example

1 - Install gcloud command line tool on local machine

  • First install the Google Cloud SDK, which provides the gcloud command, then authenticate:
gcloud auth login
  • Then set up your environment (doc):
gcloud init
  • Finally check your status (doc):
gcloud config list
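
Before going further, it can help to confirm that gcloud is actually on your PATH and to see which account and project are active. A minimal sketch; have_cmd is a hypothetical helper, not part of gcloud:

```shell
# have_cmd is a hypothetical helper: succeeds if the given command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd gcloud; then
    # Standard gcloud commands: credentialed accounts and current project.
    gcloud auth list
    gcloud config get-value project
else
    echo "gcloud not found: install the Google Cloud SDK first" >&2
fi
```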

2 - Create new project on Google Cloud

My recommendation is to do it from the gcloud console the first time.

Next time you can simply use the gcloud command line (doc), for example:

gcloud projects create "myuniqueprojectname" --name "my project human readable name"
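
The project ID must be globally unique and follow Google's naming rules, so a quick local check before calling gcloud can save a failed attempt. A sketch, assuming the usual rules (6 to 30 characters, lowercase letters, digits and hyphens, starting with a letter, not ending with a hyphen); valid_project_id is a hypothetical helper:

```shell
# valid_project_id is a hypothetical helper, not a gcloud command.
# It only checks the format; uniqueness is checked by Google at creation.
valid_project_id() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z][a-z0-9-]{4,28}[a-z0-9]$'
}

project_id="remotejupyter"   # example ID, reused in the ssh commands of this guide

if valid_project_id "$project_id"; then
    echo "OK: $project_id"
    # Then, for example:
    #   gcloud projects create "$project_id" --name "Remote Jupyter"
    #   gcloud config set project "$project_id"
else
    echo "Invalid project ID: $project_id" >&2
fi
```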

3 - Create remote VM on Google Cloud

My recommendation is to do it from the gcloud console the first time:

  • Go to Compute Engine / VM instances
  • Create a VM. There are many options. You are asked about the following main characteristics:
    • name
    • zone (choose your area, obviously - this cannot be changed later)
    • machine type, CPU and memory
    • Boot disk
    • firewall rules

Once you have customized your machine to your taste, you can get the equivalent REST or command line instructions at the bottom of the creation page.

Example for the command line, with main options only:

gcloud compute instances create myserver \
    --image-project "ubuntu-os-cloud" \
    --image "ubuntu-1404-trusty-v20170831" \
    --zone "europe-west3-b" \
    --machine-type "n1-highmem-4"

4 - Check remote VM is up and running

  • From the gcloud console in Compute Engine / VM instances
  • Or command line:
gcloud compute instances list
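
A freshly created VM takes a little while to start. If you script the setup, you may want to wait until its status is RUNNING before trying to ssh in. A sketch under the example names of this guide; wait_for_running and vm_status are hypothetical helpers, and MAX_TRIES and DELAY are made-up tuning variables:

```shell
# wait_for_running is a hypothetical helper. Its arguments form a command
# that prints the VM status; it polls until that command prints RUNNING.
wait_for_running() {
    local tries=0 max_tries="${MAX_TRIES:-30}" delay="${DELAY:-5}"
    while [ "$tries" -lt "$max_tries" ]; do
        if [ "$("$@")" = "RUNNING" ]; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "$delay"
    done
    return 1
}

# Status query for the example VM (instance and zone from this guide).
vm_status() {
    gcloud compute instances describe myserver \
        --zone "europe-west3-b" --format="value(status)"
}

# Usage: wait_for_running vm_status && echo "myserver is up"
```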

5 - Create ssh keys on local machine

  • In a terminal, run gcloud compute config-ssh. This checks whether an RSA public/private key pair exists and creates one if not.

  • Check the result in ~/.ssh:

cd ~/.ssh
cat config # human readable info
cat google_compute_engine # RSA private key
cat google_compute_engine.pub # RSA public key
cat google_compute_known_hosts # Google remote machines confirmed as known by user
  • For more info about gcloud ssh instructions, see the doc.
  • For more info about ssh independently of gcloud, see the GitHub help page for example.

6 - Log in remote VM from local machine

Log in using the ssh keys created in the previous step.

  • Terminal:
ssh myserver.europe-west3-b.remotejupyter
  • Alternative syntax:
gcloud compute --project "remotejupyter" ssh --zone "europe-west3-b" "myserver"
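
The host name in the first syntax is the alias that gcloud compute config-ssh writes to ~/.ssh/config, in the form instance.zone.project. A small sketch that rebuilds it from its parts; ssh_alias is a hypothetical helper:

```shell
# ssh_alias is a hypothetical helper that rebuilds the Host alias
# in the instance.zone.project form used in this guide.
ssh_alias() {
    local instance="$1" zone="$2" project="$3"
    printf '%s.%s.%s\n' "$instance" "$zone" "$project"
}

host="$(ssh_alias myserver europe-west3-b remotejupyter)"
echo "$host"
# Then: ssh "$host"
# If ssh does not know the alias, re-run: gcloud compute config-ssh
```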

The ssh keys are not necessary if you log in to the VM from the SSH / Open in browser window drop-down menu:

  • From gcloud console / Compute Engine / VM instances, click on the SSH button for your VM.

7 - Install software on remote VM

The following instructions must be run on your remote VM.
You can install the Anaconda or Miniconda distributions or both.
You can have any number of Anaconda2/3 or Miniconda2/3 distributions installed side by side.

# update package manager
sudo apt-get update

# install utilities
sudo apt-get -y install bzip2 wget git

# ANACONDA
# download Anaconda linux version (link from page https://www.continuum.io/downloads)
anaconda="Anaconda3-4.4.0-Linux-x86_64.sh" # update if necessary
wget -P Downloads/ https://repo.continuum.io/archive/${anaconda}
# install anaconda - accept default options except yes to prepend anaconda path to PATH
bash ~/Downloads/${anaconda}

# MINICONDA
# download Miniconda linux version (link from page https://www.continuum.io/downloads)
# the VM runs Linux, so use the Linux installer, not the MacOSX one
miniconda="Miniconda3-4.3.14-Linux-x86_64.sh" # update if necessary
wget -P Downloads/ https://repo.continuum.io/miniconda/${miniconda}
# install miniconda - accept default options except yes to prepend miniconda path to PATH
bash ~/Downloads/${miniconda}

# run .bashrc to update path
. ~/.bashrc

# update python packages - using conda - example
conda update -y conda jupyter jupyter_client jupyter_console jupyter_core \
                ipython scipy numpy matplotlib pandas

# install extra python packages - using pip - example
pip install ezhc ezvis3d

The VM is all set.
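
If you prefer to script the Miniconda part of this step rather than answer the installer prompts, something along these lines may work. It is a sketch, assuming the installer's batch flags -b and -p and an install prefix of ~/miniconda3; run and installer_url are hypothetical helpers, and nothing is executed unless you set DRY_RUN=0:

```shell
# run is a hypothetical wrapper: with DRY_RUN=1 (the default here)
# it only prints the command instead of executing it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# installer_url is a hypothetical helper building the download URL.
installer_url() {
    echo "https://repo.continuum.io/miniconda/$1"
}

miniconda="Miniconda3-4.3.14-Linux-x86_64.sh"   # update if necessary
prefix="$HOME/miniconda3"                       # assumed install location

run wget -P "$HOME/Downloads/" "$(installer_url "$miniconda")"
# -b: batch mode (no prompts), -p: install prefix
run bash "$HOME/Downloads/$miniconda" -b -p "$prefix"
# Batch mode does not edit ~/.bashrc, so this line has to be added by hand:
echo "export PATH=\"$prefix/bin:\$PATH\""
```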

8 - Set up port forwarding on local machine

  • Forward a local port (8888) to the server’s port (8888) where jupyter server is running:
ssh myserver.europe-west3-b.remotejupyter -NL 8888:localhost:8888
  • Alternative syntax:
gcloud compute ssh --project "remotejupyter" --zone "europe-west3-b" "myserver" -- -NL 8888:localhost:8888
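
If local port 8888 is already taken, for example by a local Jupyter, you can forward another local port instead. A sketch of how the -L argument is put together; forward_spec is a hypothetical helper and 8889 is an arbitrary choice:

```shell
# forward_spec is a hypothetical helper building the argument of ssh -L:
# local_port:host:remote_port, where host is resolved on the remote side.
forward_spec() {
    local local_port="$1" remote_port="${2:-8888}"
    printf '%s:localhost:%s\n' "$local_port" "$remote_port"
}

# Example: local port 8889 to port 8888 on the VM.
spec="$(forward_spec 8889 8888)"
echo "ssh myserver.europe-west3-b.remotejupyter -N -L $spec"
# -N: run no remote command, only forward ports
# -L: listen on the local port and forward it through the ssh connection
# The browser URL then becomes localhost:8889 instead of localhost:8888.
```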

9 - Launch jupyter on remote VM

  • Create a directory to contain your notebooks and launch jupyter from there:
mkdir notebooks
cd notebooks
jupyter notebook --no-browser --port=8888
  • The terminal will show something along these lines:
Olivier@myserver:~/notebooks$ jupyter notebook --no-browser --port=8888
[I 11:42:10.851 NotebookApp] Serving notebooks from local directory: /home/Olivier/notebooks
[I 11:42:10.851 NotebookApp] 0 active kernels 
[I 11:42:10.851 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/?token=31913c18411cf0fe2593bfb8e0136631c7f5fadac3b62f4a
[I 11:42:10.851 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 11:42:10.851 NotebookApp] 
    
    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://localhost:8888/?token=31913c18411cf0fe2593bfb8e0136631c7f5fadac3b62f4a
  • Copy the token, that is the string after token= in the URL
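
If you would rather not select the token by hand, it can be cut out of the URL with plain shell parameter expansion. A sketch, assuming token is the last parameter of the URL as in the log above; token_from_url is a hypothetical helper:

```shell
# token_from_url is a hypothetical helper: it prints what follows "token="
# and fails if the URL carries no token.
token_from_url() {
    local url="$1"
    case "$url" in
        *token=*) printf '%s\n' "${url##*token=}" ;;
        *) return 1 ;;
    esac
}

# URL taken from the sample log of this guide.
token_from_url "http://localhost:8888/?token=31913c18411cf0fe2593bfb8e0136631c7f5fadac3b62f4a"

# On the VM, running servers and their URLs can be listed again with:
#   jupyter notebook list
```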

10 - Open browser on local machine

  • Open your web browser at localhost:8888
  • Paste the token from the jupyter server as requested

All good.
You can start working!

11 - Stop or delete remote VM when you are finished

Be careful, it is all too easy to forget!

Stopping a VM does not delete it completely and consequently carries residual costs, such as persistent disk storage.
See Google doc

Before deleting a VM, you might want to take a snapshot of the VM persistent disk. This quickly backs up the disk so that you can recover lost data or transfer its contents to a new disk.

Several ways to do that from gcloud console / Compute Engine:

  • In menu Snapshots, take a snapshot of the VM disk. Stopping the VM first helps to get a consistent snapshot.
  • In the VM dashboard, untick Delete boot disk when instance is deleted to make sure you will not lose anything.
  • You can then create a new instance by choosing this snapshot as boot disk.
  • You may also create an image from a snapshot and create an instance from an image.
  • For more info about images and snapshots, see the doc. Essentially snapshots are faster and cheaper.
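
The same can be done from the command line. The following is a sketch, not a definitive procedure: it assumes the boot disk has the same name as the VM (the default), uses a made-up snapshot name, and goes through a hypothetical run wrapper that only prints the commands unless DRY_RUN=0:

```shell
# run is a hypothetical wrapper: with DRY_RUN=1 (the default here)
# it only prints the command instead of executing it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

vm="myserver"
zone="europe-west3-b"
snapshot="myserver-snapshot"   # hypothetical snapshot name

# Stop the VM (the disk is kept and still billed).
run gcloud compute instances stop "$vm" --zone "$zone"
# Snapshot the boot disk, assumed to have the same name as the VM.
run gcloud compute disks snapshot "$vm" --zone "$zone" --snapshot-names "$snapshot"
# Delete the VM and its disks once the snapshot exists.
run gcloud compute instances delete "$vm" --zone "$zone" --delete-disks all
```

Check the printed commands first, then run again with DRY_RUN=0 once they match your instance, zone and disk names.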

C - More with Jupyter on Google Cloud

The following pages describe more specific Jupyter installations:

The following pages describe JupyterHub installations: