OpenACC will only run on CPU with GCC
jeffhammond opened this issue
OpenACC uses GNU __atomic, not #pragma acc atomic, and the CMake assumes GCC's implementation.
I will try to fix this.
@jeffhammond, thanks for the find!
Agreed. Thanks for the find! I have been giving this some thought as well.
To my thinking, it would be ideal (from a portability standpoint) to decouple CircusTent from the GNU-specific atomics altogether. Currently, the pthreads, OpenMP, and OpenACC backends all use them.
For pthreads we could use the C11 standard, but I am not entirely sure if that equates to a step forward or backward with respect to portability.
For OpenMP & OpenACC, we should be able to replace the atomic add operations with #pragma directive versions, but I'm not sure the compare-and-swap operations are doable in the same manner. If not, we could just omit the CAS variations from these backends. Alternatively, it looks like the "capture" clause in these models might support an unconditional atomic swap.
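To make the directive idea concrete, here is a minimal sketch using OpenMP. The function names and the SwapResult struct are hypothetical, not CircusTent code. It assumes a compiler with OpenMP 4.0 or later, where the capture form { v = x; x = expr; } is accepted; I am also assuming that #pragma acc atomic update and #pragma acc atomic capture take the same shapes in OpenACC, which should be checked against the spec. Without -fopenmp the pragmas are ignored and the loops run serially with the same results.

```cpp
#include <cstdint>

// Atomic add expressed as a directive instead of __atomic_fetch_add.
// Returns the counter after n increments.
std::uint64_t directive_add(std::int64_t n) {
  std::uint64_t counter = 0;
  #pragma omp parallel for shared(counter)
  for (std::int64_t i = 0; i < n; ++i) {
    #pragma omp atomic update
    counter += 1;
  }
  return counter;
}

// Hypothetical result type for the swap demo.
struct SwapResult {
  std::uint64_t final_value;  // value left in the slot after all swaps
  std::uint64_t sum_of_old;   // sum of every value that was swapped out
};

// Unconditional atomic swap via "capture": read the old value and
// store a new one as a single atomic operation. This is a swap, not
// a compare-and-swap; nothing here is conditional on the old value.
SwapResult directive_swap(std::int64_t n, std::uint64_t initial) {
  std::uint64_t slot = initial;
  std::uint64_t sum_old = 0;
  #pragma omp parallel for shared(slot) reduction(+ : sum_old)
  for (std::int64_t i = 0; i < n; ++i) {
    std::uint64_t old;
    const std::uint64_t desired = static_cast<std::uint64_t>(i) + 1;
    #pragma omp atomic capture
    { old = slot; slot = desired; }
    sum_old += old;
  }
  return {slot, sum_old};
}
```

Since every stored value is swapped out exactly once except the last one, final_value + sum_of_old should equal the initial value plus the sum of all values written, regardless of thread interleaving. That gives a cheap correctness check for a swap kernel.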
@jeffhammond @jleidel What do you guys think?
I think you want to keep Pthreads + GCC intrinsics as a generic implementation on CPUs.
What I'd add is a true C++11 (std::thread plus std::atomic) or C++17/20 (std::for_each(std::execution::par_unseq... plus std::atomic_ref) implementation. The latter will work on CPU and GPU (once std::atomic_ref is available). If you do the C++11 one, you can do the C11 one too, with minimal additional effort.
Granted, I find std::thread annoying relative to Pthreads, and std::atomic or _Atomic means you need an array of those, unlike intrinsics or std::atomic_ref, so maybe you just want to do the C++17/20 port.
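For reference, a rough sketch of what the C++11 variant could look like. It also shows the "array of atomics" cost mentioned above: the data itself has to be declared as std::atomic elements. The names threaded_add and cas_increment are made up for illustration and are not taken from CircusTent; on older glibc the build may need -pthread.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical kernel: each thread adds 1 to every element, iters times.
// Returns the sum over all elements once the threads have joined.
std::uint64_t threaded_add(std::size_t num_threads, std::size_t len,
                           std::size_t iters) {
  // std::atomic is neither copyable nor movable, so the vector is
  // sized at construction and never resized afterwards.
  std::vector<std::atomic<std::uint64_t>> data(len);
  // Before C++20 a default-constructed std::atomic is uninitialized.
  for (auto& x : data) x.store(0, std::memory_order_relaxed);

  std::vector<std::thread> workers;
  workers.reserve(num_threads);
  for (std::size_t t = 0; t < num_threads; ++t) {
    workers.emplace_back([&data, iters] {
      for (std::size_t it = 0; it < iters; ++it)
        for (auto& x : data) x.fetch_add(1, std::memory_order_relaxed);
    });
  }
  for (auto& w : workers) w.join();

  std::uint64_t total = 0;
  for (auto& x : data) total += x.load(std::memory_order_relaxed);
  return total;
}

// Compare-and-swap is available directly on std::atomic; here it is
// used to build an increment, retrying until the exchange succeeds.
void cas_increment(std::atomic<std::uint64_t>& x) {
  std::uint64_t expected = x.load(std::memory_order_relaxed);
  while (!x.compare_exchange_weak(expected, expected + 1,
                                  std::memory_order_relaxed)) {
    // On failure, expected is refreshed with the current value.
  }
}
```

Relaxed ordering is used because the sketch only counts; a real benchmark kernel might want a different memory order.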
I haven't read enough on OpenACC atomics to be sure about CAS. They used to not have it, but I think they do now.
In any case, you can decouple the implementation of atomics from the implementation of threads pretty easily. For example, look at https://github.com/jeffhammond/Quicksilver/blob/cxx20-atomics/src/AtomicMacro.hh.
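A minimal sketch of that kind of decoupling, under my own assumptions: the ct_fetch_add and ct_compare_swap names are hypothetical, and this is not a copy of what AtomicMacro.hh does. The wrappers operate on plain integers, use std::atomic_ref when the standard library advertises it, and fall back to the GNU __atomic builtins otherwise, so the threading backend never needs to know which one is in play.

```cpp
#include <atomic>
#include <cstdint>

#if defined(__cpp_lib_atomic_ref)
// C++20 path: wrap a plain object in std::atomic_ref for each operation.
inline std::uint64_t ct_fetch_add(std::uint64_t& target, std::uint64_t value) {
  return std::atomic_ref<std::uint64_t>(target).fetch_add(
      value, std::memory_order_relaxed);
}

inline bool ct_compare_swap(std::uint64_t& target, std::uint64_t& expected,
                            std::uint64_t desired) {
  return std::atomic_ref<std::uint64_t>(target).compare_exchange_strong(
      expected, desired, std::memory_order_relaxed);
}
#else
// Fallback path: GNU-style builtins (GCC and Clang).
inline std::uint64_t ct_fetch_add(std::uint64_t& target, std::uint64_t value) {
  return __atomic_fetch_add(&target, value, __ATOMIC_RELAXED);
}

inline bool ct_compare_swap(std::uint64_t& target, std::uint64_t& expected,
                            std::uint64_t desired) {
  // Fourth argument false selects the strong variant.
  return __atomic_compare_exchange_n(&target, &expected, desired, false,
                                     __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}
#endif
```

Both paths return the previous value from ct_fetch_add and, on a failed ct_compare_swap, write the observed value back into expected, so callers behave the same either way.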
@BrodyWilliams C11 threads used to be way less portable than Pthreads, but glibc finally got with the program in 2018 (https://sourceware.org/legacy-ml/libc-alpha/2018-08/msg00003.html).
Issues should be addressed by #17. (C++11 implementation added, OpenACC & OpenMP+Target revised)