NVIDIA / libcudacxx

[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl

Home Page: https://nvidia.github.io/libcudacxx

Update semantics of cuda::atomic::fetch_min/max to RMW following P0493

gonzalobg opened this issue

Currently, cuda::atomic::fetch_{min,max} are semantically equivalent to a read followed by a call to min / max; there is no write.

This does not add much value over just explicitly performing the load followed by the min/max.

It does add a subtle API inconsistency, since the other fetch_* operations are read-modify-write (RMW) operations.

It also prevents porting code that uses atomicMax/atomicMin to cuda::atomic, and prevents emitting PTX red (reduction) instructions.

P0493 proposes these as RMW. We should implement them that way.
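
For illustration, here is a minimal sketch of what RMW semantics mean for fetch_max, written against std::atomic so it compiles on the host. The helper name fetch_max_rmw is hypothetical and is not part of libcudacxx or of P0493; it only shows that the exchange is attempted even when the stored value is already the maximum, so the operation is always a read-modify-write rather than a plain load followed by max.

```cpp
#include <atomic>

// Hypothetical helper for illustration only (not a libcudacxx API):
// an RMW-style fetch_max built on a compare-exchange loop.
template <class T>
T fetch_max_rmw(std::atomic<T>& obj, T operand,
                std::memory_order order = std::memory_order_seq_cst) {
    T old = obj.load(std::memory_order_relaxed);
    // Always attempt the exchange, even if max(old, operand) == old,
    // so that every call performs a write and takes part in the
    // modification order of obj like the other fetch_* operations.
    while (!obj.compare_exchange_weak(old,
                                      old < operand ? operand : old,
                                      order,
                                      std::memory_order_relaxed)) {
        // On failure, old has been reloaded with the current value; retry.
    }
    return old;  // value observed immediately before the update
}
```

Like the other fetch_* operations, the helper returns the previous value. A real implementation on the device would presumably map to a single atomic or reduction instruction instead of a CAS loop; the loop is used here only because it is portable.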

This has been fixed in #197.