pyGIMLi and NUM_THREADS
prisae opened this issue
Problem description
I can try to ensure that each of my processes uses only one thread by setting my environment variables accordingly:
export OPENBLAS_NUM_THREADS=1
export MKL_NUM_THREADS=1
export OMP_NUM_THREADS=1
export NUMBA_NUM_THREADS=1
export NUMEXPR_NUM_THREADS=1
export NUM_THREADS=1
This works fine UNTIL I do a simple import pygimli: this fiddles with my settings and makes my processes use several hundred % of CPU, which can be annoying on shared clusters.
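For reference, the same single-thread setup can be sketched from inside Python. This is only an illustration of the exports above, not something pyGIMLi provides: the helper name limit_threads is hypothetical, and it only has an effect if it runs before any numerical library is imported.

```python
import os

# The same variables as in the shell exports above.
THREAD_VARS = (
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "OMP_NUM_THREADS",
    "NUMBA_NUM_THREADS",
    "NUMEXPR_NUM_THREADS",
    "NUM_THREADS",
)


def limit_threads(n=1, environ=os.environ):
    """Set every *NUM_THREADS variable listed above to n (hypothetical helper)."""
    for name in THREAD_VARS:
        environ[name] = str(n)


if __name__ == "__main__":
    # Must run before importing numpy, scipy, etc., which read these at load time.
    limit_threads(1)
    print({name: os.environ[name] for name in THREAD_VARS})
```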
Your environment
Operating system: Linux (RHEL 8.9)
Python version: 3.10.13
pyGIMLi version: 1.4.5
Output of print(pygimli.Report()):
--------------------------------------------------------------------------------
Date: Thu Jun 06 10:30:59 2024 CEST
OS : Linux
CPU(s) : 256
Machine : x86_64
Architecture : 64bit
RAM : 1006.8 GiB
Environment : Jupyter
File system : ext4
Python 3.10.13 | packaged by conda-forge | (main, Oct 26 2023, 18:07:37)
[GCC 12.3.0]
pygimli : 1.4.5
pgcore : 1.4.0
numpy : 1.24.4
matplotlib : 3.8.2
scipy : 1.11.4
tqdm : 4.66.1
IPython : 8.18.1
pyvista : 0.43.1
Intel(R) oneAPI Math Kernel Library Version 2023.2-Product Build 20230613
for Intel(R) 64 architecture applications
--------------------------------------------------------------------------------
Way of installation: conda
Steps to reproduce
Set all *NUM_THREADS environment variables to one, import pygimli, and observe that the process uses more than 100% CPU.
Expected behavior
I would expect either of:
- pygimli respecting user-defined env variables, or
- pygimli indicating when it changes env variables, and providing a way to disable it.
This circumvents the issue (mostly):
import pygimli as pg
pg.setThreadCount(1)
but still, I would not expect an import to mess with my variables.
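A slightly more general version of this workaround reads the user's setting before the import and re-applies it afterwards. This is a minimal sketch under assumptions: the helper names requested_threads and import_pygimli_respecting_env are hypothetical, only OPENBLAS_NUM_THREADS is consulted, and the only pyGIMLi call used is pg.setThreadCount from the snippet above.

```python
import os


def requested_threads(environ=os.environ, default=1):
    """Return the thread count requested via OPENBLAS_NUM_THREADS, else default."""
    value = environ.get("OPENBLAS_NUM_THREADS", "")
    if value.isdigit() and int(value) > 0:
        return int(value)
    return default


def import_pygimli_respecting_env():
    """Import pygimli, then restore the thread count the user asked for."""
    n = requested_threads()  # read the setting before the import can change it
    import pygimli as pg     # requires pygimli to be installed
    pg.setThreadCount(n)
    return pg
```

Calling pg = import_pygimli_respecting_env() in place of a plain import keeps the workaround in one spot.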
For convenience, the core extension of pygimli sets OPENBLAS_NUM_THREADS to the number of CPUs minus 2 right at initialization. You can change it back after importing pygimli with pg.setThreadCount(1).
Maybe we could change this so that it only sets this environment variable if it is not already specified by the user?
I see. But when the number of CPU is 256 on a shared cluster, that is a very inconvenient default IMHO.
I would prefer what you say afterwards. IF there is a user set env variable, it should be respected.
And maybe also a MAX (no need to set the nthread to 254).
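The behaviour asked for here (respect a user-set variable, otherwise use a capped default) could look roughly like the following. This is a sketch of the idea only, not the pyGIMLi core implementation: the function name resolve_thread_count and the cap of 16 are assumptions chosen for illustration.

```python
import os


def resolve_thread_count(environ, cpu_count, max_threads=16):
    """Honour the user's OPENBLAS_NUM_THREADS, else fall back to a capped default."""
    user_value = environ.get("OPENBLAS_NUM_THREADS")
    if user_value is not None and user_value.strip().isdigit() and int(user_value) > 0:
        return int(user_value)
    # No usable user setting: leave two CPUs free, but stay within [1, max_threads].
    return max(1, min(cpu_count - 2, max_threads))


if __name__ == "__main__":
    print(resolve_thread_count(os.environ, os.cpu_count() or 1))
```

With this logic, a 256-CPU node without any user setting would get 16 threads instead of 254, while an explicit OPENBLAS_NUM_THREADS=1 is left alone.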
Changed the default behaviour. It will be live after the next core update.
export OPENBLAS_NUM_THREADS=12 && python -c 'import pygimli as pg; print(pg.core.threadCount())'
12
unset OPENBLAS_NUM_THREADS && python -c 'import pygimli as pg; print(pg.core.threadCount())'
16
Great, like it, thanks for the quick turnaround! 8 or 16 was exactly what our HPC expert here also suggested as a maximum, as openblas won't be very efficient beyond that number.