troycomi / reportseff

Tabular seff

When a job is composed of a bunch of MPI processes, the efficiency you are computing is incorrect

amalkhabouHQ opened this issue · comments

Hello,
I am using your reportseff tool. I wanted to mention that when a job is composed of x MPI processes (36 in my example below), your computation is wrong:

reportseff 6196869
    JobID    State          Elapsed  TimeEff   CPUEff   MemEff
  6196869  COMPLETED    12-14:16:39    ---      5.0%     0.9%
 sacct -P -n -a --format JobID,State,AllocCPUS,REQMEM,TotalCPU,Elapsed,MaxRSS,ExitCode,NNodes,NTasks -j  6196869
6196869|COMPLETED|720|191846Mn|451-06:00:24|12-14:16:39||0:0|20|
6196869.batch|COMPLETED|36|191846Mn|451-06:00:24|12-14:16:39|33824748K|0:0|1|1

My CPUs are running at 100%, but you are reporting 5%. The CPUEff should be 451.4 / 12.52 / 36, since the 36 MPI processes are running in parallel, which corresponds to 100%.
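For reference, here is a quick sketch of that expected calculation using the numbers from the sacct output above (an illustration only, not reportseff's actual code):

```python
# Reporter's expected calculation: divide the job's CPU time by the
# elapsed time and the 36 cores of a single node.
# Values taken from the sacct output above; variable names are illustrative.
total_cpu_days = 451 + (6 * 3600 + 24) / 86400           # TotalCPU = 451-06:00:24
elapsed_days = 12 + (14 * 3600 + 16 * 60 + 39) / 86400   # Elapsed = 12-14:16:39
cores_per_node = 36

print(total_cpu_days / (elapsed_days * cores_per_node))  # ~0.995, i.e. ~100%
```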

Can you run reportseff --debug 6196869 and post the output here?

Just to be sure, you have 36 cores per node and 20 nodes? It seems I'm dividing by the 720 total allocated CPUs.

Finally, does the memory efficiency look ok or is it also off by a factor of 20?
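As a quick check, assuming MemEff is computed as MaxRSS divided by the per-node requested memory times the node count (a sketch under that assumption, not reportseff's actual source), the sacct values above do reproduce the reported 0.9%:

```python
# Hedged check of the "factor of 20" question, using the sacct values above.
max_rss_mb = 33824748 / 1024     # MaxRSS = 33824748K
req_mem_mb_per_node = 191846     # REQMEM = 191846Mn (per node)
nodes = 20

print(f"{max_rss_mb / (req_mem_mb_per_node * nodes):.1%}")  # -> 0.9%
```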

Thanks for bringing this up, I usually don't parallelize over multiple nodes.

reportseff --debug 6196869
^|^720^|^12-14:16:39^|^6196869^|^6196869^|^^|^20^|^191846Mn^|^COMPLETED^|^UNLIMITED^|^451-06:00:24
^|^36^|^12-14:16:39^|^6196869.batch^|^6196869.batch^|^33824748K^|^1^|^191846Mn^|^COMPLETED^|^^|^451-06:00:24

Yeah, I am using 20 nodes, each with 36 CPUs. I think the problem concerns the memory efficiency as well, but I will double check.

It seems only one node is actually used in your job, based on the sacct output. Can you ssh into a few nodes and check their CPU usage while you run your code? I want to be sure before I close this.

The TotalCPU is for all CPUs on all nodes (451 days), and AllocCPUS also counts all nodes (720). So you are using 451 / (720 * 12.5 [elapsed days]) ≈ 5% of the allocated CPU time.
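In other words, the reported value follows from CPUEff = TotalCPU / (AllocCPUS * Elapsed). A minimal sketch with the job's values (again, an illustration rather than reportseff's actual source):

```python
# CPUEff = TotalCPU / (AllocCPUS * Elapsed), using the values from the
# debug output above.
total_cpu_days = 451 + (6 * 3600 + 24) / 86400           # 451-06:00:24
elapsed_days = 12 + (14 * 3600 + 16 * 60 + 39) / 86400   # 12-14:16:39
alloc_cpus = 720                                         # 20 nodes * 36 CPUs

print(f"{total_cpu_days / (alloc_cpus * elapsed_days):.1%}")  # -> 5.0%
```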

I just double checked on a running job with htop on the different nodes, and all the CPUs are running at 100%.

What version of Slurm are you running?
Can you check the output of seff?
Did you truncate the debug output?

Based on the second line of the sacct output (6196869.batch), it's only showing one node running; I'm not sure exactly what's happening.