allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution

Home Page: https://clear.ml/docs

Change folder for debug samples

daandres opened this issue · comments

I already asked this question on Stack Overflow, but it has not been resolved:
https://stackoverflow.com/questions/77368228/clearml-change-debug-samples-output-destination-after-moving-task-to-another-pro

I start a ClearML Task with .init() and then start my application, which is tracked by the automatic logging. However, I only learn the project and task names once the application has started, because it reads a config file that does not exist before the init() call. So I create a temporary task, later move it to the correct project with task.move_to_project(new_project_name=project_name), and rename it with task.rename(new_name=task_name). But the debug samples are not moved to the new project structure and task name. I found that I can update the output_uri with task.output_uri = folder, but this only changes the location of the model repository, not the location of the debug samples.

My questions are:

1. How can I dynamically change the location of the debug samples?
2. How can I move existing debug samples to a new location and update existing tasks?

Pseudo code:

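from clearml import Task

# Start a temporary task so that automatic logging is active from the beginning
# (the temporary names here are placeholders)
task = Task.init(project_name='temp_project_name', task_name='temp_task_name')

# .. do custom initialization (read_config stands for the application's own config loader)
project_name = read_config('project_name')
task_name = read_config('task_name')

# Move the task to the final project and give it its final name
task.move_to_project(new_project_name=project_name)
task.rename(new_name=task_name)

# This only changes where models are stored
new_output_path = '~/clearml/'
task.output_uri = new_output_path
# TODO (1) change logging directory for debug_samples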

I am using the latest clearml from PyPI and the latest server version. My framework is PyTorch (Lightning), and I'm running on Ubuntu.

Thanks.

Expected behaviour

Debug images are logged to the folder I set (not the default of the ClearML instance).

Environment

  • self hosted
  • 3.8
  • Ubuntu 22.04
  • single user
  • task execution on the same machine as the ClearML Server (Web and File)

Hi @daandres, when you refer to the folder for the debug samples, do you mean the target URL/destination where they are stored, or (in case they are stored on the file server) the actual folder on the file server's data disk where they are stored?

Currently, the debug samples get uploaded to the fileserver and saved under /opt/clearml/data/fileserver//.

However, /opt/ is on a disk with less storage, so I want to change the directory used by the fileserver.
I would prefer to save a task's debug samples in my logging_dir, where my TensorBoard logs are saved, but I cannot change this directory.

Also, I create the task with a temporary project name, then calculate something inside the task and move the task to another project. But the debug samples remain in the fileserver folder temp_project_name/<task_id> and are not moved to the new project name. This makes it hard to remove outdated debug files manually.

Thanks

This sounds like a deployment issue - you can set up the fileserver so that the data dir will be different, if that's what's blocking you.
As for the temporary project name, the project directory used for uploads is resolved at initialization time, which is why your later changes do not affect it. Can you simply calculate the names first and only then call Task.init()?
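The suggestion to resolve the names before calling Task.init() could look roughly like the sketch below. It assumes the project and task names can be read from a JSON file before any tracking starts; the file layout and the helper names read_names and start_task are hypothetical and not part of ClearML.

```python
import json
from pathlib import Path


def read_names(config_path):
    """Return (project_name, task_name) from a JSON config file.

    The two keys are an assumed layout, chosen to match the names
    used in the pseudo code of the question.
    """
    cfg = json.loads(Path(config_path).read_text())
    return cfg["project_name"], cfg["task_name"]


def start_task(config_path):
    """Resolve the final names first, then initialize the task once."""
    # Imported lazily so that reading the config does not require clearml
    from clearml import Task

    project_name, task_name = read_names(config_path)
    # The upload folder is resolved here, so it already uses the final names
    return Task.init(project_name=project_name, task_name=task_name)
```

With this ordering no temporary task is needed, at the cost of not tracking anything that happens before the config has been read.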

I was thinking about dynamically changing the fileserver save directory per task, instead of reconfiguring the whole fileserver to use another device. I have multiple large, detachable storage devices and choose between them depending on my needs.

I want to keep track of all steps involving a task, so I want ClearML to track all my logs. That's why I first set up a temporary task to log everything and change it as soon as I know the project name.

Hi @daandres understood. Currently (and for reasons of security best practices), the fileserver uses a single root folder where all files are stored, and does not allow any external intervention regarding the root folder used.

Regarding changing the storage path after the task has been initialized and the project has changed: this is currently not supported. We can certainly add that as an open issue.

Thanks for the clarification.
How can I manually change the global root folder of the fileserver and migrate all existing files to the new location?

Thanks for opening an issue for changing the storage path after a project change.

Assuming you're using docker-compose to deploy the server, there are two lines, mapping a volume mount to the external folder containing the fileserver files that need to be changed:

In both, you'll need to change the left-hand side of the mount to the folder of your choice. To move the files there, take the server down, copy (or move) the files to the new folder, change the docker-compose file to reference this new folder instead of /opt/clearml/data/fileserver, and then start the server.

Of course, I would recommend copying instead of moving, just so you'll have a backup of the old state in case something goes wrong.
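The copy step could be sketched in Python as follows. This is only an illustration: the helper name copy_fileserver_data and the example target path are hypothetical, and it assumes the server is already stopped and the process has read and write access to both folders.

```python
import shutil
from pathlib import Path


def copy_fileserver_data(old_root, new_root):
    """Copy the fileserver data folder, leaving the original in place as a backup.

    Returns (old_count, new_count), the number of files under each root,
    so the two can be compared before the server is restarted.
    """
    old_root, new_root = Path(old_root), Path(new_root)
    # dirs_exist_ok needs Python 3.8+; files with the same name are overwritten
    shutil.copytree(old_root, new_root, dirs_exist_ok=True)

    def count_files(root):
        return sum(1 for p in root.rglob("*") if p.is_file())

    return count_files(old_root), count_files(new_root)


# Example call (the target path is a placeholder for the new disk):
# copy_fileserver_data("/opt/clearml/data/fileserver", "/mnt/bigdisk/clearml/fileserver")
```

If the two counts match and the new folder was empty beforehand, the volume mounts can be pointed at the new folder and the server started again.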

Thanks. It works.