NGSolve / ngsolve

Netgen/NGSolve is a high performance multiphysics finite element software. It is widely used to analyze models from solid mechanics, fluid dynamics and electromagnetics. Due to its flexible Python interface new physical equations and solution algorithms can be implemented easily.

Home Page: https://ngsolve.org/

pyngcore.SetNumThreads() does not appear to be respected

Alex-Vasile opened this issue

I expected code wrapped in a with TaskManager(): block to respect the number of threads set using pyngcore.SetNumThreads(), but that does not appear to be true in all cases.

I have attached a file to reproduce the issue: infinite_loop.txt

You will have to rename it from infinite_loop.txt to infinite_loop.py since GitHub won't let me upload a .py file.

Running

The python file takes one argument, a string that's either 'direct' or 'cg'.

To run threaded only, i.e. with NGSolve's TaskManager, run the file with Python directly (python3 infinite_loop.py X, where X is either 'direct' or 'cg').
It has been configured to run with only one thread (pyngcore.SetNumThreads(1)).

To run with MPI, call mpiexec -n 1 python3 infinite_loop.py X, where X is either 'direct' or 'cg', so that it runs with 1 process and provides a direct comparison to TaskManager.

Behaviour

Expected

Using either TaskManager or MPI with 1 thread, the average CPU usage of the program is expected to hover around 100% (Unix-style measurement, where 100% is one thread at full capacity). This was expected to hold for both the 'direct' solve and CG.
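For readers who want to put a number on this instead of watching a system monitor, here is a minimal sketch of the Unix-style percentage: CPU time consumed by the whole process divided by wall-clock time. The names cpu_percent and busy are hypothetical and are not part of infinite_loop.py; busy is a pure-Python stand-in for the solve loop.

```python
import time


def cpu_percent(work):
    """Run work() and return average CPU usage in Unix-style percent.

    time.process_time() counts CPU time for the whole process (all threads),
    so a value well above 100 means more than one thread was busy.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return 100.0 * cpu_used / wall_used if wall_used > 0 else 0.0


def busy():
    # Hypothetical single-threaded workload standing in for the solver loop.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


if __name__ == "__main__":
    print(f"average CPU usage: {cpu_percent(busy):.0f}%")
```

Wrapping the solve call from the reproduction script in cpu_percent should make a multithreaded direct solver show up as a value well above 100, while a single-threaded run stays near or below it.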

Actual

When running with CG, the CPU usage remained at ~100% for the loop regardless of whether it was run with TaskManager or MPI.

However, when the 'direct' solve was used, there was a large discrepancy between TaskManager and MPI.
When run with MPI, the average CPU usage fluctuated between 120% and 200% (greater than the expected 100%).
When run with TaskManager, the CPU usage fluctuated between 300% and 800% (I have a 4 core / 8 thread machine, so this represents using the entirety of my resources).

The multithreading you observe is due to the direct solver (which one depends on your build configuration).
pyngcore.SetNumThreads() only affects multithreading by the TaskManager() and does not prevent other libraries from creating their own threads.
In your case, try experimenting with the environment variables OMP_NUM_THREADS and/or MKL_NUM_THREADS.

Best,
Matthias
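A minimal sketch of the environment-variable suggestion, not taken from the thread itself: the helper name limit_library_threads is hypothetical, and it assumes the direct solver's backend reads OMP_NUM_THREADS or MKL_NUM_THREADS when it initializes, which is why the variables are set before NGSolve is imported. Whether this takes effect depends on which direct solver the build uses.

```python
import os


def limit_library_threads(n):
    """Ask OpenMP- and MKL-based libraries to use at most n threads.

    Hypothetical helper: it only sets environment variables, so it should be
    called before the numerical libraries are loaded.
    """
    for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[var] = str(n)


limit_library_threads(1)

try:
    # Import only after the variables are set (assumption: the backend reads
    # them at initialization time).
    import ngsolve
    import pyngcore

    # This call limits the TaskManager only, as explained in the reply above.
    pyngcore.SetNumThreads(1)
except ImportError:
    ngsolve = None  # NGSolve is not installed; the variables are still set.
```

The same effect can be had by setting both variables in the shell before launching python3 infinite_loop.py, which avoids any dependence on import order.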

I see. Thanks for the update.

Follow-up question: why doesn't SetNumThreads change those environment variables?

Examples such as this one, in code block 8, give the impression that all of the code inside the 'with TaskManager():' block would use the same number of threads.