Caliban is a tool for developing research workflows and notebooks in an isolated Docker environment and submitting those isolated environments to Google Compute Cloud.
Caliban makes it astonishingly easy to develop and execute code locally, and then ship the exact same code up to a Cloud environment for execution on Big Iron machines.
To get started:
- see the Installation section below, then
- visit the short tutorial at the "Getting Started" section.
- Next steps for exploration are outlined at "What Next?", and
- the Overview provides some more flavor on the various subcommands that Caliban provides.
Full documentation for Caliban lives at Read The Docs.
“Be not afeard; the isle is full of noises,
Sounds, and sweet airs, that give delight and hurt not.
Sometimes a thousand twangling instruments
Will hum about mine ears; and sometime voices,
That, if I then had waked after long sleep,
Will make me sleep again: and then, in dreaming,
The clouds methought would open, and show riches
Ready to drop upon me; that, when I waked,
I cried to dream again.” -- Shakespeare, The Tempest
Caliban lives on PyPI, so installation is as easy as:
pip install -U caliban
If you want to make Caliban available globally, we recommend installing via pipx. Get pipx installed, and then run:
pipx install caliban
To run any commands, you'll need to install Docker, and make sure your Python version is >= 3.7.0:
$ python --version
Python 3.7.7
On a Mac, install Python and Docker, then check that your installation works by navigating to an empty folder and running:
$ caliban --help
usage: caliban [-h] [--helpfull] [--version]
{shell,notebook,build,run,cloud,cluster,status,stop,resubmit}
...
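The two prerequisites above (Python >= 3.7.0 and a Docker install) can also be checked from a script. The following is a minimal sketch, not part of Caliban itself: the function name `check_prerequisites` is hypothetical, and it only confirms that a `docker` executable is on your PATH, not that the Docker daemon is running.

```python
import shutil
import sys

MIN_PYTHON = (3, 7, 0)  # minimum version stated in the installation notes


def check_prerequisites(version_info=None, which=shutil.which):
    """Return a list of problems; an empty list means both checks passed.

    Hypothetical helper for illustration. `version_info` and `which` can be
    overridden, which makes the function easy to test.
    """
    version = tuple((version_info or sys.version_info)[:3])
    problems = []
    if version < MIN_PYTHON:
        problems.append("Python {}.{}.{} is older than 3.7.0".format(*version))
    # Only checks that the executable exists, not that the daemon is up.
    if which("docker") is None:
        problems.append("the docker executable was not found on your PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    print("Prerequisites look fine." if not issues else "\n".join(issues))
```

If the script reports a problem, fix it before running `caliban --help`.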
Our more detailed Getting Started documentation has instructions for Linux boxes, nvidia-docker setup and Google Cloud credential configuration. Armed with these tools you'll be able to run scripts locally using your GPU or submit caliban-dockerized jobs to Cloud.
Now that you have Caliban installed:
- see the Getting Started section below, or
- read the Overview for a discussion of Caliban's subcommands.
Caliban provides the following six subcommands, which you run inside a project directory on your machine:
- caliban shell generates a Docker image containing any dependencies you've declared in a requirements.txt and/or setup.py in the directory and opens an interactive shell in that directory. The caliban shell environment is ~identical to the environment that will be available to your code when you submit it to AI Platform; the difference is that your current directory is live-mounted into the container, so you can develop interactively.
- caliban notebook starts a Jupyter notebook or lab instance inside of a docker image containing your dependencies; the guarantee about an environment identical to AI Platform applies here as well.
- caliban run packages your directory's code into the Docker image and executes it locally using docker run. If you have a GPU, the instance will attach to it by default - no need to install the CUDA toolkit. The docker environment takes care of all that. This environment is truly identical to the AI Platform environment. The docker image that runs locally is the same image that will run in AI Platform.
- caliban cloud allows you to submit jobs to AI Platform that will run inside the same docker image you used with caliban run. You can submit hundreds of jobs at once. Any machine type, GPU count, and GPU type combination you specify will be validated client side, so you'll see an immediate error with suggestions, rather than having to debug by submitting jobs over and over.
- caliban build builds the docker image used in caliban cloud and caliban run without actually running the container or submitting any code.
- caliban cluster creates GKE clusters and submits jobs to GKE clusters.
This first example will show you how to use Caliban to run a short script inside of a Caliban-generated Docker container, then submit that script to AI Platform.
Make a new project folder and create a small script:
mkdir project && cd project
echo "import platform; print(f\"Hello, World, from a {platform.system()} machine.\")" > hello.py
Run the script with your local Python executable:
$ python hello.py
Hello, World, from a Darwin machine.
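The `echo` one-liner above packs the whole script into a single escaped string. If you would rather edit `hello.py` by hand, a more readable layout with the same output might look like the sketch below; the `greeting` function is a hypothetical name added here only to make the script easier to extend.

```python
import platform


def greeting():
    # platform.system() reports the OS name, e.g. "Darwin" on macOS
    # and "Linux" inside a Docker container.
    return f"Hello, World, from a {platform.system()} machine."


if __name__ == "__main__":
    print(greeting())
```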
Use Caliban to run the same script inside a Docker container:
caliban run --nogpu hello.py
...elided...
I0611 15:12:44.371632 4389141952 docker.py:781] Running command: docker run --ipc host 58a1a3bf6145
Hello, World, from a Linux machine.
I0611 15:12:45.000511 4389141952 docker.py:738] Job 1 succeeded!
Change a single word to submit the same script to Google's AI Platform:
caliban cloud --nogpu hello.py
(For this last step to work, you'll need to set up a Google Cloud account by following these instructions).
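To make the "single word" difference concrete, here is a small sketch that builds both command lines from Python. It is an illustration only: `caliban_command` and `submit` are hypothetical helpers, not part of Caliban, and they use only the `run` and `cloud` subcommands and the `--nogpu` flag shown above. By default nothing is executed; the commands are just printed.

```python
import shlex
import subprocess


def caliban_command(subcommand, script, nogpu=True):
    """Build the argument list for `caliban <subcommand> [--nogpu] <script>`."""
    if subcommand not in ("run", "cloud"):
        raise ValueError(f"unsupported subcommand: {subcommand!r}")
    args = ["caliban", subcommand]
    if nogpu:
        args.append("--nogpu")
    args.append(script)
    return args


def submit(subcommand, script, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    args = caliban_command(subcommand, script)
    print(" ".join(shlex.quote(a) for a in args))
    if not dry_run:
        # Requires Caliban and Docker (and Cloud credentials for "cloud").
        subprocess.run(args, check=True)
    return args


if __name__ == "__main__":
    submit("run", "hello.py")    # local Docker container
    submit("cloud", "hello.py")  # Google's AI Platform
```

The two argument lists differ only in their second element, which is why moving from a local run to a Cloud submission does not require touching the script.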
This next example shows you how to do interactive development using caliban shell. Once you get your script working, you can use caliban run and caliban cloud on the script, just like above.
Run the following command in the project folder you created earlier:
caliban shell --nogpu
If this is your first command, you'll see quite a bit of activity as Caliban downloads its base image to your machine and builds your first container. After this passes, you should see Caliban's terminal:
I0611 12:33:17.551121 4500135360 docker.py:911] Running command: docker run --ipc host -w /usr/app -u 735994:89939 -v /Users/totoro/code/example:/usr/app -it --entrypoint /bin/bash -v /Users/totoro:/home/totoro ab8a7d7db868
_________ __ ________ ___ _ __ __ __
/ ____/ | / / / _/ __ )/ | / | / / \ \ \ \
/ / / /| | / / / // __ / /| | / |/ / \ \ \ \
/ /___/ ___ |/ /____/ // /_/ / ___ |/ /| / / / / /
\____/_/ |_/_____/___/_____/_/ |_/_/ |_/ /_/ /_/
You are running caliban shell as user with ID 735994 and group 89939,
which should map to the ID and group for your user on the Docker host. Great!
[totoro@6a9b28990757 /usr/app]$
You're now living in an isolated Docker container, running Linux, with a clean virtual environment available. Your home directory and the folder where you ran the command are both live-mounted into the container, so any changes you make to either of those directories will be reflected immediately.
Type C-d to exit the container.
Create a new file called requirements.txt in the folder and add the tensorflow dependency, then run caliban shell again:
echo tensorflow >> requirements.txt
caliban shell --nogpu
You should see more activity as caliban builds a new container with tensorflow installed. This time, inside your container, run python and check that tensorflow is installed:
$ python
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.__version__
'2.2.0'
Any code you write in this folder will be accessible in the shell. If it works in caliban shell, you can be almost certain that your code will execute in a Cloud environment, with potentially many GPUs attached and much larger machines available.
And you didn't have to write a single Dockerfile!
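Because files in the project folder are visible inside the shell, the interactive tensorflow check can also be saved as a script and rerun whenever requirements.txt changes. The sketch below is one way to do that using only the standard library; `installed_version` is a hypothetical helper, and it expects a top-level module name such as `tensorflow`.

```python
import importlib
import importlib.util


def installed_version(module_name):
    """Return the module's __version__, "unknown" if it has none,
    or None if the module cannot be found.

    Hypothetical helper for illustration; pass a top-level module name.
    """
    if importlib.util.find_spec(module_name) is None:
        return None
    module = importlib.import_module(module_name)
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    version = installed_version("tensorflow")
    if version is None:
        print("tensorflow is not installed in this environment")
    else:
        print(f"tensorflow {version} is installed")
```

Save it in the project folder under any name you like and run it with `python` from inside the container.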
Next, you might want to explore:
- Triggering hundreds of jobs from an experiment config
- Submitting jobs to Google Cloud with caliban cloud
- Working in a Jupyter notebook in the same isolated environment where your production code will run, via caliban notebook
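To give a feel for how one experiment config can fan out into many jobs, the sketch below expands a dictionary of candidate values into every combination. This is an illustration of the general idea only, not Caliban's implementation: the function `expand_grid` and the parameter names are hypothetical, and the config format Caliban actually accepts is described in its documentation.

```python
import itertools


def expand_grid(config):
    """Expand {parameter: value-or-list} into one dict per combination."""
    keys = list(config)
    # Wrap scalars so every parameter contributes a list of candidates.
    candidates = [v if isinstance(v, list) else [v] for v in config.values()]
    return [dict(zip(keys, combo)) for combo in itertools.product(*candidates)]


if __name__ == "__main__":
    # Hypothetical hyperparameters, chosen only for this example.
    grid = expand_grid({
        "learning_rate": [0.01, 0.001],
        "batch_size": [32, 64],
        "epochs": 5,
    })
    for job in grid:
        print(job)
```

Two values for each of two parameters yield four combinations here; longer lists multiply quickly, which is how a short config can describe hundreds of jobs.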
There is a lot to explore. Head over to Caliban's documentation site and check out the links on the sidebar.
If you find anything confusing, please feel free to create an issue on our GitHub Issues page, and we'll get you sorted out.
This is a research project, not an official Google product. Expect bugs and sharp edges. Please help by trying out Caliban, reporting bugs, and letting us know what you think!
Please refer to our Contributor's Guide for information on how to get started contributing to Caliban.
If Caliban helps you in your research, please consider citing the repository:
@software{caliban2020github,
author = {Vinay Ramasesh and Sam Ritchie and Ambrose Slone},
title = {{Caliban}: Docker-based job manager for reproducible workflows},
url = {http://github.com/google/caliban},
version = {0.1.0},
year = {2020},
}
In the above BibTeX entry, names are in alphabetical order, the version number is intended to be that of the latest tag on GitHub, and the year corresponds to the project's open-source release.
Copyright 2020 Google LLC.
Licensed under the Apache License, Version 2.0.