This repository contains a Proof of Concept on how to integrate Jupyter Notebooks with MLflow, for AI model versioning and serving, and with SFTP & Minio for artefact storage.
The PoC has been implemented to mimic MLflow running as a remote tracking server in a Docker container, with the SFTP and Minio servers also running as Docker containers. To ease the way the Jupyter notebooks communicate with the MLflow tracking server, JupyterLab also runs within a Docker container.
All four containers run with the assistance of Docker Compose, which also configures a custom network shared amongst the containers.
To get this PoC running on a MacBook Pro, one needs to install the following applications:
If you are not a fan of Docker and would like to have only JupyterLab and MLflow running locally, you will need the following:
Although you are encouraged to read all the sections below in order to better understand the whole setup, let's save some time and start with a more concise step-by-step setup.
- Run the Docker containers:
  `docker-compose up`
- The first run will take some time because it has to pull some images and build others.
- Once the containers are running, open another terminal tab and make the `copy_known_hosts.sh` and `create_experiments.sh` scripts executable:
  `chmod +x copy_known_hosts.sh`
  `chmod +x create_experiments.sh`
- Now copy the `known_hosts` file to the JupyterLab container:
  `./copy_known_hosts.sh`
- Create the experiments in the MLflow container:
  `./create_experiments.sh`
- Create a bucket on Minio:
  - Go to the Minio UI on http://localhost:9000
  - Click on the + sign in the bottom-right corner
  - Create a bucket called `ai-models`
- Open JupyterLab:
  - Go to your browser and type http://localhost:8991/lab
  - Open the `conv-net-in-keras.ipynb` notebook and run all the cells
- Go to the MLflow UI on http://localhost:5500 and check the experiments
- Go to the Minio UI and check the content of the bucket
If you want to understand why those steps were made, please keep reading.
The four containers used are named as follows:
- `ekholabs-minio`: based on the `minio/minio` image.
- `ekholabs-sftp`: based on the `atmoz/sftp` image.
- `ekholabs-mlflow`: built from the `Dockerfile.mlflowserver` file.
- `ekholabs-jupyterlab`: built from the `Dockerfile.jupyterlab` file.
A picture is worth a thousand words. When the picture is in ASCII, it's even better! ;)
-- MacBook Dev Environment ----------------------------------
| |
| -- Docker Engine ------------------------ |
| | | |
| ------ | ------------ -------- | |
| | User | -------> | JupyterLab | -------> | MLflow | | |
| ------ | ------------ -------- | |
| ^ | | / | | |
| | | | / | | |
| --------|------------|---------------- | | |
| | | | | |
| | | --------- | | |
| | | | | | |
| | V V | | |
| | ------ ------- | | |
| | | SFTP | | Minio | <---- | |
| | ------ ------- | | |
| | ^ | | |
| | | | | |
| | --------------------- | |
| ----------------------------------------- |
| |
-------------------------------------------------------------
- User runs models on notebooks served by JupyterLab;
- JupyterLab notebooks store metrics, parameters and the model on the MLflow file storage;
- JupyterLab notebooks store artefacts (aka model files) on the SFTP or Minio server, depending on which experiment id is being used (see the sketch below for how to inspect each experiment's artifact location). The files are identified by the run id from MLflow;
- The user can browse the experiments on MLflow via its UI.
- The buckets kept on Minio are accessible via its UI.
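To see which storage backend a given experiment points to, you can ask the tracking server for each experiment's artifact location. The snippet below is a minimal sketch, assuming the tracking server is reachable on http://localhost:5500 from the host; the listing method name depends on your MLflow version.

```python
# Minimal sketch: list experiments and their artifact locations.
# Assumes MLflow is reachable on http://localhost:5500 (the port used in this PoC).
import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri("http://localhost:5500")
client = MlflowClient()

# On older MLflow versions this call is `client.list_experiments()`.
for experiment in client.search_experiments():
    # The artifact_location (e.g. an sftp:// or s3:// URI) decides whether
    # run artefacts end up on the SFTP server or in the Minio bucket.
    print(experiment.experiment_id, experiment.name, experiment.artifact_location)
```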
For the purpose of this PoC, the containers run on a local machine, a MacBook Pro. However, for a robust and resilient experience, it is recommended that both the MLflow and SFTP servers run on different machines. In addition to that, it is expected that the volumes used for storage are properly backed up.
Running the containers with `docker-compose` is a no-brainer. To accomplish that, please do the following:
- Run -> `docker-compose up`
- The command above should be enough to build the images from their respective Dockerfiles and download the images used for the SFTP and Minio servers.
Although you can also run in detached mode, I recommend doing the first run without it. Why? Because you can follow all the green letters printed out on your terminal. Besides that, it's good to get acquainted with the logs of the services, in case errors happen during startup.
- Run in detached mode -> `docker-compose up -d`
After starting the containers, you can access JupyterLab, MLflow and Minio in the following way:
- JupyterLab: http://localhost:8991/lab
- Copy the token printed out on the terminal to be able to access JupyterLab
- MLflow: http://localhost:5500
- Minio: http://localhost:9000
You will notice on the MLflow UI that only one experiment is available, the `Default` one. The intention behind this exercise is to use SFTP and Minio as storage. Hence, the `Default` experiment is not a good choice. Let's explore other experiments.
To start with, we have to create the experiments on the MLflow server. But how? Easy: connect to the container and execute one command. Of course, that only works once you are inside the container.
So, let's connect to the `ekholabs-mlflow` container. To do that, run the command below:
`docker exec -it ekholabs-mlflow /bin/bash`
If you type `mlflow --help`, you will see a list of possible commands and options.
To create our experiments, which will use SFTP and Minio as storage servers, just execute the `create_experiments.sh` script. I also advise you to have a look at the script to understand how the experiments are created.
If you face issues when running the script, please make sure the containers are running and that the script is executable (`chmod +x create_experiments.sh`).
The script should be executed from the project root in the following way:
- Run -> `./scripts/create_experiments.sh`
Now, if you go back to the MLflow frontend, you will see that two experiments have been created.
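For reference, experiments with non-default artifact locations can also be created programmatically. The sketch below is not the actual content of `create_experiments.sh`: the experiment names and the SFTP user/path are made up for illustration, while the `s3://ai-models` location matches the bucket created in the Minio UI.

```python
# Hedged sketch: create two experiments whose artifact locations point at
# the SFTP and Minio containers. Names and the SFTP path are hypothetical;
# the real create_experiments.sh script may use the mlflow CLI instead.
import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri("http://localhost:5500")
client = MlflowClient()

client.create_experiment(
    "sftp-experiment",                                    # hypothetical name
    artifact_location="sftp://user@ekholabs-sftp/upload", # hypothetical user/path
)
client.create_experiment(
    "minio-experiment",                                   # hypothetical name
    artifact_location="s3://ai-models",                   # bucket created in the Minio UI
)
```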
Before we get to the Jupyter notebooks, let's understand how the communication between the services works and what we need to do to work around some issues of having it all running locally.
If you take a look at the `docker-compose.yml` file, you will easily notice that the `ekholabs-sftp` container has the `ssh_sftp_key.pub` file mounted as a volume. That helps to ease communication with the SFTP server without using passwords. It also means that the private key has to be added to the other containers that will have to communicate with the SFTP server.
Hence, looking further into the `docker-compose.yml` file, you will notice that both `ekholabs-mlflow` and `ekholabs-jupyterlab` have the `ssh_sftp_key` file mounted as a volume. Along with that file, which is the private key, we also have an SSH config file mounted as a volume.
Almost there... hang on.
Besides having both the private key and the SSH config file mounted as volumes, we need one last thing: the `ekholabs-sftp` host has to be added to `known_hosts`, which lives under `~/.ssh/known_hosts` inside the containers.
Adding that extra information to the MLflow container is pretty easy. It comes with OpenSSH installed, so just running the command below does the trick:
ssh-keyscan -H ekholabs-sftp >> ~/.ssh/known_hosts
Please do not execute the line above: this command is already part of the `docker-compose.yml`, which means that the host will be added to the `known_hosts` file of the MLflow container automatically at start-up.
However, when it comes to the JupyterLab container, we do have an issue: it does not contain the OpenSSH package and we are not allowed to install it. Hence, `ssh-keyscan` won't work.
What's the problem with that? Well, when the notebook tries to log the artefact on the tracking server (MLflow), `pysftp`/`Paramiko` will complain and throw an exception saying: `No hosts for key!`
So, it can only mean one thing: we have to get the `known_hosts` from the `ekholabs-mlflow` container into the `ekholabs-jupyterlab` container. But how? Well, with Docker! I mean, with a shell script that will run some commands in the Docker containers.
Take a peek inside the `copy_known_hosts.sh` shell script to understand what it's doing. Once you are done, please execute that script from the root directory of the project.
- Run -> `./scripts/copy_known_hosts.sh`
Now you are - almost - good to go!
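If you want to double-check that the copy worked, a quick `pysftp` connection attempt from a notebook cell will fail with the same host-key error as long as `ekholabs-sftp` is missing from `known_hosts`. This is a minimal sketch only: the private key path and the SFTP username below are assumptions, not values taken from this repository, so check `docker-compose.yml` for the real ones.

```python
# Hedged connectivity check: pysftp loads ~/.ssh/known_hosts by default via CnOpts,
# so this only succeeds after copy_known_hosts.sh has been executed.
# The key path and username are assumptions; adjust them to your docker-compose.yml.
import pysftp

cnopts = pysftp.CnOpts()  # reads ~/.ssh/known_hosts; complains if the host key is unknown

with pysftp.Connection(
    "ekholabs-sftp",
    username="user",                               # hypothetical SFTP user
    private_key="/home/jovyan/.ssh/ssh_sftp_key",  # hypothetical mount path
    cnopts=cnopts,
) as sftp:
    print(sftp.listdir())  # list the remote home directory as a smoke test
```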
P.S.: are you better than me at Docker sheiße? Please, contribute to this repo and get rid of so many steps. :)
Well, completely unrelated, but I have a friend called Leonardo Bucket. Every time I do something with S3 and have to say or write the word bucket, it reminds me of him. :D
So, if we want to use Minio as storage, we do need a bucket. And, moreover, the bucket has to be created before we try to store artefacts.
If you have looked inside `create_experiments.sh`, you might have noticed that we expect a bucket called `ai-models` there, not Leonardo.
To create a bucket, just go to Minio (http://localhost:9000) and the rest you should know.
Ah, do you need a key id and secret combination to log in? Have a look at the `docker-compose.yml` file; the Minio key id / secret pair is defined there.
Done with the bucket? If so, now you are good to go. ;)
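If you prefer to create the bucket from code instead of the UI, something along these lines should work with `boto3` against Minio's S3-compatible API. The access key and secret below are placeholders, not the real values; take them from `docker-compose.yml`.

```python
# Hedged sketch: create the ai-models bucket on Minio via its S3-compatible API.
# The credentials are placeholders; the real key id / secret live in docker-compose.yml.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9000",        # Minio endpoint used in this PoC
    aws_access_key_id="<MINIO_ACCESS_KEY>",      # placeholder
    aws_secret_access_key="<MINIO_SECRET_KEY>",  # placeholder
)

s3.create_bucket(Bucket="ai-models")
print([bucket["Name"] for bucket in s3.list_buckets()["Buckets"]])
```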
There is not much to say here. To actually see how it's done, go to your JupyterLab frontend and open the `conv-net-in-keras.ipynb` notebook. The last cell contains all the magic you need.
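The actual cell is best read in the notebook itself, but the general shape of such a logging cell looks roughly like the sketch below. The in-network addresses, experiment name, parameters and model file name are assumptions, not copied from the notebook. Note that when the experiment's artifact location is an `s3://` (Minio) URI, the notebook also needs the Minio endpoint and credentials exposed as environment variables so MLflow can upload the artefact.

```python
# Rough sketch of an MLflow logging cell; names, URIs and values are illustrative only.
import os
import mlflow

# Only needed for experiments whose artifact location is an s3:// (Minio) URI.
os.environ["MLFLOW_S3_ENDPOINT_URL"] = "http://ekholabs-minio:9000"  # assumed in-network address
os.environ["AWS_ACCESS_KEY_ID"] = "<MINIO_ACCESS_KEY>"      # placeholder, see docker-compose.yml
os.environ["AWS_SECRET_ACCESS_KEY"] = "<MINIO_SECRET_KEY>"  # placeholder, see docker-compose.yml

mlflow.set_tracking_uri("http://ekholabs-mlflow:5500")  # assumed in-network address
mlflow.set_experiment("minio-experiment")               # hypothetical experiment name

with mlflow.start_run() as run:
    mlflow.log_param("epochs", 10)          # illustrative parameter
    mlflow.log_metric("accuracy", 0.99)     # illustrative metric
    mlflow.log_artifact("model.h5")         # the saved Keras model; artefacts are keyed by run id
    print("run id:", run.info.run_id)
```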
This repository already contains the SSH keys that should be used to communicate with the SFTP service, from both the MLflow and Jupyter notebook perspectives.
If this setup is used to deploy an SFTP server for multiple users, presumably with different SSH keys, then the information in the subsection below has to be taken into account.
Keys generated on MacBooks work when connecting to the SFTP server using the `sftp` command from the shell. However, when the connection is established from within Jupyter, the `pysftp` library is used, which uses `Paramiko` under the hood. And that brings issues!
Do not waste your time googling a solution; trust me. The easiest / quickest thing to do is generate your keys on a Linux machine / Docker container. The key files under the `keys` directory have been created on a Docker container.
This is work in progress, and more about MLflow features, like model serving, will be added soon.