allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution

Home Page: https://clear.ml/docs

How to clean-up local storage on agent hosts

thaikoh opened this issue

We have multiple agents running tasks from a server. On each agent host, the directory /home/user/.clearml/cache/storage_manager/datasets contains the following files and directories:

➜  datasets ll
total 63M
-rw-r--r-- 1 root root 1.7M Jan 23 16:17 0657a6f81e67d7dec63928aeec239f91.state.json
-rw-r--r-- 1 root root  22M Jan 31 23:40 14ce4f985c9da0b144bc3c4ecf439dd9.state.json
-rw-r--r-- 1 root root 1.1M Jan 31 23:47 342f25d588034d5dfb0abb6d9d9f470a.state.json
-rw-r--r-- 1 root root 3.3M Jan 31 23:46 49ff4e76b0413dc8866ac0b8d652df09.state.json
-rw-r--r-- 1 root root 5.3M Nov 21 17:47 536335d05cd8190952c1c70739708be7.state.json
-rw-r--r-- 1 root root 1.4M Jan 31 23:46 b918955acbc5f2433a1b46a8af36d7cd.state.json
-rw-r--r-- 1 root root 5.2M Nov 22 13:57 c105ce6fef8921bcd84fbb4d04d7a65f.state.json
-rw-r--r-- 1 root root 1.6M Jan 31 23:46 d4673bcd031c059c3ae8412bb7b60bee.state.json
drwxr-xr-x 3 root root 4.0K Nov 22 13:57 ds_0f2a68f7cb3c498789764f4bedf310f8
drwxr-xr-x 3 root root 4.0K Jan 23 16:15 ds_39435110f9bd432ba0dc3e468598382c
drwxr-xr-x 3 root root 4.0K Jan 12 12:30 ds_4b8939e8dc654c7f8fdd9a1341a3d57a
drwxr-xr-x 3 root root 4.0K Jan 23 16:15 ds_68ebd46cbb2f4b73bf55704d23352c51
drwxr-xr-x 3 root root 4.0K Jan 31 23:46 ds_7e76a7fdb92544759970ac4d179cd2e9
drwxr-xr-x 3 root root 4.0K Jan 31 23:41 ds_82833df6861f4ab592ec9da0920c4ddc
drwxr-xr-x 3 root root 4.0K Nov 21 17:47 ds_9e8651212dbf4be088e540d1db3df092
drwxr-xr-x 3 root root 4.0K Jan 31 23:46 ds_a9517a01182b41818789363c02971c2d
drwxr-xr-x 3 root root 4.0K Jan 31 23:47 ds_ccf94b267bb24755b01e27fcb64e3889
drwxr-xr-x 3 root root 4.0K Jan 31 23:46 ds_eb29b16f746b4f5eba984a315976fc64
-rw-r--r-- 1 root root 1.2M Jan 23 16:17 e396d98f7fda58d0771851fa487708fb.state.json
-rw-r--r-- 1 root root  22M Jan 12 12:30 e5c7192b0eb6a9c4422a1e6c1d35686e.state.json

The total size is tens of gigabytes on each agent host, so the question is: how can we safely clean up these files?

@thaikoh ClearML has a built-in cache setting that caps the total number of files kept in the cache. If you'd like to clean up the files there yourself, you can probably do that by access time, removing the least recently accessed entries first.
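For the manual route, a minimal sketch of cleaning by access time might look like the following. This is not ClearML's own cleanup mechanism; the function names (`entry_atime`, `find_stale_entries`, `remove_entries`) and the 30-day threshold are hypothetical. It assumes each top-level entry in the cache directory (a `ds_*` directory or a `*.state.json` file) can be removed independently, that no task is using the cache while it runs, and that the filesystem records access times (mounts with `noatime` do not).

```python
import shutil
import time
from pathlib import Path
from typing import List, Optional


def entry_atime(path: Path) -> float:
    """Return the last access time of a file, or the newest one among files in a directory."""
    # Read the entry's own atime first: listing a directory can update its atime.
    own = path.lstat().st_atime
    if path.is_dir() and not path.is_symlink():
        file_times = [p.lstat().st_atime for p in path.rglob("*") if not p.is_dir()]
        return max(file_times, default=own)
    return own


def find_stale_entries(cache_dir: Path, max_age_days: float,
                       now: Optional[float] = None) -> List[Path]:
    """List top-level cache entries not accessed within the last max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(p for p in cache_dir.iterdir() if entry_atime(p) < cutoff)


def remove_entries(entries: List[Path], dry_run: bool = True) -> None:
    """Delete the given entries; with dry_run=True only print what would be removed."""
    for path in entries:
        if dry_run:
            print(f"would remove {path}")
        elif path.is_dir() and not path.is_symlink():
            shutil.rmtree(path)
        else:
            path.unlink()


if __name__ == "__main__":
    # Path taken from the issue; the 30-day threshold is an arbitrary example value.
    cache = Path("/home/user/.clearml/cache/storage_manager/datasets")
    if cache.is_dir():
        remove_entries(find_stale_entries(cache, max_age_days=30), dry_run=True)
```

The sketch defaults to a dry run so the list can be reviewed first; passing `dry_run=False` actually deletes. Since the files in the listing are owned by root, the script would need matching permissions, and it is safest to run it while the agent on that host is idle.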