sandialabs / compadre

Compadre (Compatible Particle Discretization and Remap)

CTests Time Out with -j > 4

jmgate opened this issue · comments

I'm just trying to get up and running with Compadre, and the first time I tried to test it, I saw a bunch of tests timing out. By default I use ctest -j 24, so I started backing that down and got everything to pass with ctest -j 4. Has anyone else run into problems along these lines before? I'm concerned that as we're working toward snapshotting this into Trilinos (#164), we'll also need to revamp the testing such that it can run with higher -j.

Hi @jmgate, do you actually have 24 cores available? The timeouts are set high enough that they should pass if a user actually has that many cores available. "ctest -j" with no number should just use the number of cores available and still pass.

I have 32 cores, hyperthreaded, so 64 logical cores. Running ctest -j looks like it's defaulting to -j 1.

Yes, it does look like "ctest -j" with no number defaults to N=1. Each of the tests has serial portions and parallel portions. When that many tests run at once, the parallel portions of each test are slowed down (compared to how fast they would run if the test were allowed to spawn more threads of its own), but the serial portions of different tests then run concurrently. We can increase the timeouts without causing any harm.

Or if the guidance is simply to use ctest -j instead of ctest -j X, that's fine. I just wanted to be sure I could accurately test Compadre before I start working on it.

From @kuberry:

You could use "OMP_NUM_THREADS=1 ctest -j 24" and the tests may finish in time on your computer. When we someday switch to only using Kokkos-Kernels, that environment variable won't be needed.
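To illustrate the suggestion above, here is a minimal sketch of picking a per-test OpenMP thread count so that the number of ctest jobs times the threads per test stays within the available logical cores. The helper name threads_per_test is hypothetical (it is not part of Compadre or CTest), and the use of nproc assumes GNU coreutils. The script only prints the resulting command; it does not run the tests.

```shell
#!/bin/sh
# Hypothetical helper (not part of Compadre or CTest): choose an OpenMP
# thread count per test so that jobs * threads <= logical cores.
threads_per_test() {
    cores=$1
    jobs=$2
    threads=$(( cores / jobs ))
    # Never go below one thread per test.
    if [ "$threads" -lt 1 ]; then
        threads=1
    fi
    echo "$threads"
}

JOBS=24
# Assumption: nproc is available (GNU coreutils); fall back to 1 otherwise.
CORES=$(nproc 2>/dev/null || echo 1)

# Print the command instead of running it, so it can be inspected first.
echo "OMP_NUM_THREADS=$(threads_per_test "$CORES" "$JOBS") ctest -j $JOBS"
```

With 64 logical cores and 24 jobs, the integer division gives 2 threads per test. OMP_NUM_THREADS=1, as suggested above, is the most conservative choice and should avoid oversubscription at any -j value up to the core count.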

I'm not actually concerned with the speed of the tests. Rather, I know the typical way to test Trilinos is with ctest -j X, and if that doesn't work out of the box, there may be additional work needed there before we can get Compadre into Trilinos.