(3.9.0-3.9.1) Default ThreadsPerCore Slurm setting causes reduced CPU utilization
nihitsaxena4 opened this issue · comments
Bug description
ParallelCluster does not explicitly set ThreadsPerCore in the compute node configuration, so Slurm falls back to its default value of 1. Slurm v23.11 introduced a change that requires the ThreadsPerCore setting to match the number of threads per physical core on the underlying instance. On compute resources whose instance type supports hardware multi-threading, and where multi-threading has not been disabled, this mismatch limits CPU utilization to around 50%, because Slurm never allocates work to the secondary virtual cores.
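The 50% figure follows from simple arithmetic, which the sketch below illustrates. It is a simplified model, not Slurm's actual allocation logic: the function name `schedulable_cpu_fraction` and the 96-vCPU instance are hypothetical, chosen only for illustration.

```python
def schedulable_cpu_fraction(
    vcpus: int,
    hw_threads_per_core: int,
    configured_threads_per_core: int = 1,
) -> float:
    """Return the fraction of vCPUs Slurm can allocate under a simplified model.

    Assumption: Slurm uses at most `configured_threads_per_core` threads on
    each physical core, so any extra hardware threads stay idle.
    """
    physical_cores = vcpus // hw_threads_per_core
    usable_threads = physical_cores * min(
        configured_threads_per_core, hw_threads_per_core
    )
    return usable_threads / vcpus


# Hypothetical instance: 96 vCPUs with 2 hardware threads per physical core.
print(schedulable_cpu_fraction(96, 2))     # default ThreadsPerCore=1 -> 0.5
print(schedulable_cpu_fraction(96, 2, 2))  # ThreadsPerCore matches hardware -> 1.0
```

In this model, instances with multi-threading disabled (one thread per core) are unaffected, since the default of 1 already matches the hardware.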
Affected versions (OSes, schedulers)
- ParallelCluster 3.9.0, 3.9.1
- Slurm 23.11.4
- All operating systems supported by ParallelCluster
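To check whether a particular compute node is affected, one option is to compare the ThreadsPerCore value Slurm reports for the node with the value the hardware reports. The sketch below is a diagnostic aid only, not the mitigation. It assumes you have captured the text output of `scontrol show node <nodename>` and of `lscpu` on that node. The helper names and the sample strings are hypothetical and abbreviated.

```python
import re
from typing import Optional


def configured_threads_per_core(scontrol_output: str) -> Optional[int]:
    """Extract ThreadsPerCore from `scontrol show node` text, if present."""
    match = re.search(r"ThreadsPerCore=(\d+)", scontrol_output)
    return int(match.group(1)) if match else None


def hardware_threads_per_core(lscpu_output: str) -> Optional[int]:
    """Extract the 'Thread(s) per core' value from `lscpu` text, if present."""
    match = re.search(r"Thread\(s\) per core:\s*(\d+)", lscpu_output)
    return int(match.group(1)) if match else None


def is_affected(scontrol_output: str, lscpu_output: str) -> bool:
    """True when Slurm is configured with fewer threads per core than the hardware has."""
    configured = configured_threads_per_core(scontrol_output)
    hardware = hardware_threads_per_core(lscpu_output)
    if configured is None or hardware is None:
        raise ValueError("could not find ThreadsPerCore in the supplied output")
    return configured < hardware


# Illustrative, abbreviated sample text (not captured from a real cluster).
sample_scontrol = "NodeName=example-node CoresPerSocket=48 Sockets=1 ThreadsPerCore=1"
sample_lscpu = "CPU(s):                96\nThread(s) per core:    2\n"

print(is_affected(sample_scontrol, sample_lscpu))
```

Parsing captured text keeps the check independent of where it runs. The commands themselves can be executed on the head node and the compute node respectively.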
Mitigation
You can find a detailed explanation and the mitigation of the problem here.