LLNL / maestrowf

A tool to easily orchestrate general computational workflows both locally and on supercomputers

Home Page: https://maestrowf.readthedocs.io

dynamic rate limiting of job submissions?

BenWibking opened this issue

On the cluster I'm using, there is a hard limit of 36 jobs per user that are running or pending in the SLURM queue.

However, I need to run a 200-parameter study. Is there any workaround for this other than splitting the large study into sub-studies of at most 36 parameters each?

It would be ideal if it were possible for the conductor process to wait until jobs complete and then submit new jobs.

Following the route described in the docs (https://maestrowf.readthedocs.io/en/latest/Maestro/how_to_guides/running_with_flux.html#launch-maestro-external-to-the-batch-jobflux-broker) seems like the best option for my use case.

I've managed to install Flux via Spack on this cluster. The one remaining issue is that I have to wait until the SLURM job starts before I can run maestro run on the login node.

If I wanted to modify the Maestro conductor code so it polls SLURM to see whether the Flux broker job has started, where should I start to do that? Is this feasible?
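
In the meantime, a shell-level workaround along these lines might be enough, rather than touching the conductor itself (the broker job id and the spec filename below are placeholders):

# wait until the Flux broker's SLURM job is actually RUNNING, then start the study
BROKER_JOBID=12345                     # placeholder: SLURM job id of the Flux broker allocation
while [ "$(squeue -j "$BROKER_JOBID" -h -o %T)" != RUNNING ]; do
    sleep 30                           # poll SLURM every 30 seconds
done
maestro run study.yaml                 # placeholder spec name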

Hi @BenWibking -- one thing to note is that maestro run also has a throttle option; you could limit the jobs to 36 there. Do keep in mind that it is a universal limit across both local and scheduled steps, so if you have a lot of local steps ahead of submitted steps, you will artificially limit yourself there.
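
For reference, the invocation would look roughly like this (the spec filename is a placeholder):

maestro run --throttle 36 study.yaml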

Adding --throttle 36 solves the problem and works perfectly.

I was a bit thrown off by the wording in the documentation for the --throttle option. It might help to clarify that it refers to the total number of jobs in the (external, non-Maestro) scheduler queue (both running and pending), rather than only those that are actually executing.
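
Concretely, the count that matters seems to be roughly what this reports for the user, not just the jobs that are actually RUNNING:

squeue -u $USER -h -t PENDING,RUNNING | wc -l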

I checked the status of this study today and it seems to have stopped submitting new jobs to SLURM.

maestro status reports that several dozen steps are PENDING and dozens more are INITIALIZED, but nothing is in the SLURM queue. Maybe this is related to #441?

The last log entry is:

2024-05-05 11:40:01,492 - maestrowf.conductor:monitor_study:349 - INFO - Checking DAG status at 2024-05-05 11:40:01.492025
2024-05-05 11:40:01,597 - maestrowf.datastructures.core.executiongraph:check_study_status:963 - INFO - Jobs found for user 'bwibking'.
2024-05-05 11:40:01,598 - maestrowf.datastructures.core.executiongraph:execute_ready_steps:916 - INFO - Found 0 available slots...
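
For what it's worth, a side-by-side check like this is how I'd confirm the mismatch between what the conductor believes and what SLURM actually has queued (the study path is the one used by the conductor process shown below):

# jobs SLURM actually has queued for the user (nothing, per the above)
squeue -u $USER
# the conductor's view of the study
maestro status /scratch/02661/bwibking/precipitator-paper/outputs/medres_compressive_20240504-193253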

The full log for this study is here:
medres_compressive.log.zip

The conductor process for this study is still running:

login4.stampede3(1011)$ ps aux | grep $USER
bwibking 3092832  0.0  0.0  20660 11896 ?        Ss   May04   0:01 /usr/lib/systemd/systemd --user
bwibking 3092835  0.0  0.0 202568  6948 ?        S    May04   0:00 (sd-pam)
bwibking 3093950  0.0  0.0   7264  3472 ?        S    May04   0:00 /bin/sh -c nohup conductor -t 60 -d 2 /scratch/02661/bwibking/precipitator-paper/outputs/medres_compressive_20240504-193253 > /scratch/02661/bwibking/precipitator-paper/outputs/medres_compressive_20240504-193253/medres_compressive.txt 2>&1
bwibking 3093951  0.2  0.0 328808 72948 ?        S    May04   2:17 /scratch/projects/compilers/intel24.0/oneapi/intelpython/python3.9/bin/python3.9 /home1/02661/bwibking/.local/bin/conductor -t 60 -d 2 /scratch/02661/bwibking/precipitator-paper/outputs/medres_compressive_20240504-193253
root     3993349  0.0  0.0  39960 12012 ?        Ss   11:32   0:00 sshd: bwibking [priv]
bwibking 3993762  0.0  0.0  40144  7516 ?        S    11:33   0:00 sshd: bwibking@pts/73
bwibking 3993765  0.0  0.0  18048  6128 pts/73   Ss   11:33   0:00 -bash
bwibking 3998925  0.0  0.0  19236  3652 pts/73   R+   11:39   0:00 ps aux
bwibking 3998926  0.0  0.0   6432  2336 pts/73   S+   11:39   0:00 grep --color=auto bwibking

This seems to reliably happen for studies that I run on this machine.

This issue seems to be the same as #441, and that one has more informative logs, so I'll close this.