sachua / mlflow-docker-compose

MLflow deployment with 1 command

latest commits recreated volumes?

mathematicalmichael opened this issue

What happened with the last few commits, exactly? I noticed you updated an image to address "command not found", but I think in doing so all the data in my volume was wiped out. I lost over an hour trying to find it and eventually couldn't (and noticed the disk usage shrank), and had to restore it from a daily snapshot (thank goodness for that).

Is this an issue with my filesystem / Docker somehow? Or are there commands in the docker-compose file that could have removed the data?

I'm noticing the files were in docker/overlay2/l/ instead of docker/volumes/ ... That makes me suspect my previous deployment was never writing to the minio_data volume at all, but rather to the overlay filesystem inside the container, which is why recreating the container made the files disappear. That would explain why upgrading made it look like all my files vanished: the container changed.
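For anyone checking the same thing, a quick way to see whether the MinIO data path is backed by the named volume or just by container-local overlay storage is something like this (the container name here is a guess at whatever docker ps shows for your MinIO service):

```
# List the mounts Docker attached to the MinIO container.
# If MinIO's data directory shows Type "volume" with the minio_data name,
# writes go to the named volume; if it is missing, writes land in the
# container's overlay filesystem and vanish when the container is recreated.
docker inspect --format '{{ range .Mounts }}{{ .Type }} {{ .Name }} -> {{ .Destination }}{{ "\n" }}{{ end }}' mlflow_minio
```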

Hi @mathematicalmichael, this project was originally created as a very simple proof of concept of a working MLflow server using docker-compose. The mc command was therefore supposed to recreate the mlflow bucket whenever the docker-compose services were restarted.

I noticed the mc service was not running as intended to automatically create the mlflow bucket in MinIO for MLflow to store its objects, and updated the image to an older version that was able to run the nc command properly. I suspect your original issue of requiring a newer MinIO version was probably linked to the previous, newer mc image, and that the mc service was never running properly in the first place. I have removed the command that removes the mlflow bucket in 629fe84.
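For context, the mc helper's job boils down to a couple of commands, roughly along these lines (the alias, endpoint, and credential variables are placeholders for whatever the compose file actually defines):

```
# Register the MinIO endpoint under an alias, then create the bucket.
mc config host add minio http://minio:9000 "$MINIO_ACCESS_KEY" "$MINIO_SECRET_KEY"
# Create the mlflow bucket only if it does not already exist,
# without removing it first on every restart.
mc mb minio/mlflow || true
```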

However, the files should have been stored in the minio_data volume. I have just tested that the minio_data volume was indeed created and the files were stored there. Can you run docker volume ls to confirm that the volumes were created? If the files were not stored in the minio_data volume mount, they will not be carried over when your container is recreated, which would make it look like your files have "disappeared".
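For example:

```
# Named volumes created by docker-compose (the project name prefix may differ):
docker volume ls

# Where the minio_data volume actually lives on the host
# (substitute the prefixed name from `docker volume ls` if needed):
docker volume inspect minio_data --format '{{ .Mountpoint }}'
```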

Now they are, but at the time I originally set them up, it was storing everything in overlay storage, not in the volume.

I suppose the first time I spun them up, the command was missing, so when it finally started working, it deleted everything. That could be it. But there were non-mlflow buckets that disappeared as well... something very odd happened with the Docker volume indeed.

The PR I submitted a while back was what I had been running for months. Then I upgraded, restarted the service, and all my data was gone, irreversibly. I restored from the previous night's backup.

I'll be more careful about validating, but this time I could indeed SEE data created within the volume when writing to the buckets, so I hope this is "good to go" from here on out. I'll keep trying to update versions and submit PRs here, because I don't see another, more popular repo that deploys this same stack. If you know of any, please point me to them.

"mlflow docker compose" yields this repo really high up, most recent popular result.

It is strange that mc in the stack depends on an image that is so old; it feels wrong to have it frozen in time.

I suppose one option is to "extend" the official MinIO stack (https://docs.min.io/docs/deploy-minio-on-docker-compose.html) somehow, or just run a stack that interacts with it and only handles MLflow.

The reason mc is required is to automate the creation of the mlflow bucket that will be used as MLflow's artifact store when the MinIO service is first initialised. The link you shared deploys a distributed MinIO stack and will not solve this issue.

I also like the idea of being able to just git clone this repo and immediately run docker-compose up to get a fully working MLflow stack. If we removed MinIO from the services it would no longer be a one-command deployment, as we would need to deploy MinIO separately and then set up MLflow to use it as an object store.

However, I have changed the mc service to use a wait-for-it.sh script to check that MinIO is ready before creating the bucket, instead of relying on nc, which was removed from the newer mc images, so it no longer depends on an older image!
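Roughly, the readiness check amounts to something like this inside the mc service's entrypoint (a sketch of the idea rather than the exact script, with the hostname and port assumed to be minio:9000):

```
#!/usr/bin/env bash
# Block until MinIO accepts TCP connections, using bash's built-in /dev/tcp
# (the same trick wait-for-it.sh falls back to), so no nc binary is needed.
until (echo > /dev/tcp/minio/9000) 2>/dev/null; do
  echo "waiting for MinIO..."
  sleep 1
done
# ...then run the bucket-creation commands shown earlier.
```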

Searching on GitHub, I found a few repos such as https://github.com/ymym3412/mlflow-docker-compose and https://github.com/Toumash/mlflow-docker, but I have not used them personally. You can check them out if you're interested. Of course, I will still try to support this repo to the best of my ability.