HYBRID_MOMENTUM does not work with HIP/ROCm
zingale opened this issue · comments
Michael Zingale commented
If we build wdmerger on Frontier via:
make COMP=gnu USE_HIP=TRUE TINY_PROFILE=TRUE -j 4
then we get memory access errors during the initialization:
Initializing the data at level 0
Memory access fault by GPU node-4 (Agent handle: 0x37c48c0) on address (nil). Reason: Page not present or supervisor privilege.
SIGABRT
See Backtrace.0 file for details
MPICH ERROR [Rank 0] [job id 1318117.0] [Thu May 11 12:26:27 2023] [frontier02151] - Abort(6) (rank 0 in comm 0): application called
MPI_Abort(MPI_COMM_WORLD, 6) - process 0
This is happening in the call to linear_to_hybrid_momentum after we fill the state.
Note: this occurs even if we run with castro.hybrid_hydro=0, since we don't skip these calls (and also the construction of the hydro source).
If I compile with USE_HYBRID_MOMENTUM=FALSE, then the problem runs fine.
Michael Zingale commented
The problem seems to be these lines in linear_to_hybrid_momentum:
for (int dir = 0; dir < 3; ++dir)
u(i,j,k,UMR+dir) = hybrid_mom[dir];
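
One way to narrow this down might be to compare the loop against a hand-unrolled version of the same three stores. The sketch below is host-only and only shows that the two forms write identical values. It does not reproduce the fault, and whether unrolling changes anything on the HIP build is untested here. `State4`, `NCOMP`, the value chosen for `UMR`, and all function names are hypothetical stand-ins, not Castro or AMReX code.

```cpp
// Host-only sketch; State4, NCOMP, the UMR value, and the function names
// are hypothetical stand-ins, not taken from Castro or AMReX.
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int NCOMP = 8;  // hypothetical number of state components
constexpr int UMR = 3;    // hypothetical index of the first hybrid momentum component

// Minimal stand-in for a 4-D state accessor called as u(i,j,k,n).
struct State4 {
    int nx, ny, nz;
    std::vector<double> data;

    State4(int nx_, int ny_, int nz_)
        : nx(nx_), ny(ny_), nz(nz_),
          data(static_cast<std::size_t>(nx_) * ny_ * nz_ * NCOMP, 0.0) {}

    double& operator()(int i, int j, int k, int n) {
        return data[((static_cast<std::size_t>(n) * nz + k) * ny + j) * nx + i];
    }
};

// Loop form, matching the lines quoted in the issue.
inline void store_loop(State4& u, int i, int j, int k,
                       const std::array<double, 3>& hybrid_mom) {
    for (int dir = 0; dir < 3; ++dir) {
        u(i, j, k, UMR + dir) = hybrid_mom[dir];
    }
}

// Hand-unrolled form: the same three stores with constant component offsets.
inline void store_unrolled(State4& u, int i, int j, int k,
                           const std::array<double, 3>& hybrid_mom) {
    u(i, j, k, UMR)     = hybrid_mom[0];
    u(i, j, k, UMR + 1) = hybrid_mom[1];
    u(i, j, k, UMR + 2) = hybrid_mom[2];
}

// Returns true when both forms leave the state in the same condition.
inline bool variants_agree() {
    State4 a(2, 2, 2);
    State4 b(2, 2, 2);
    const std::array<double, 3> hm{1.5, -2.0, 0.25};
    store_loop(a, 1, 0, 1, hm);
    store_unrolled(b, 1, 0, 1, hm);
    return a.data == b.data && a(1, 0, 1, UMR + 2) == 0.25;
}
```

If the unrolled form ran cleanly on the device while the loop form faulted, that would point toward how the loop is compiled rather than toward the state array itself. That is a guess, not a diagnosis.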