Documentation of warp-wide collectives refers to `__syncthreads` instead of `__syncwarp`
fkallen opened this issue
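For example, the documentation at https://nvlabs.github.io/cub/class_warp_exchange.html#a078092b662bf8cdc67c2322d71f0a776 states:

> A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the collective's temporary storage (e.g., temp_storage) is to be reused or repurposed.

Since `WarpExchange` is a warp-wide collective, this sentence should presumably refer to `__syncwarp()` rather than `__syncthreads()`.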
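
To illustrate the usage in question, here is a minimal sketch of reusing a `cub::WarpExchange` temporary storage within a single warp. It assumes that `__syncwarp()` is the barrier the documentation intends, which is what this issue argues and is not confirmed by the docs themselves. The names `RoundTripKernel`, `RunRoundTrip`, and the `k...` constants are hypothetical and chosen only for this example.

```cuda
#include <cub/warp/warp_exchange.cuh>
#include <cuda_runtime.h>
#include <vector>

// Hypothetical sizes for illustration: one full warp, four items per thread.
constexpr int kWarpThreads    = 32;
constexpr int kItemsPerThread = 4;
constexpr int kNumItems       = kWarpThreads * kItemsPerThread;

using WarpExchangeT = cub::WarpExchange<int, kItemsPerThread, kWarpThreads>;

// Converts blocked -> striped -> blocked, reusing the same temp storage.
__global__ void RoundTripKernel(int* out) {
  __shared__ typename WarpExchangeT::TempStorage temp_storage;

  // Blocked arrangement: thread t owns items [t * 4, t * 4 + 3].
  int items[kItemsPerThread];
  for (int i = 0; i < kItemsPerThread; ++i) {
    items[i] = static_cast<int>(threadIdx.x) * kItemsPerThread + i;
  }

  WarpExchangeT(temp_storage).BlockedToStriped(items, items);

  // Assumption: a warp-level barrier is enough before reusing temp_storage,
  // because only the threads of this warp touch it.
  __syncwarp();

  WarpExchangeT(temp_storage).StripedToBlocked(items, items);

  for (int i = 0; i < kItemsPerThread; ++i) {
    out[static_cast<int>(threadIdx.x) * kItemsPerThread + i] = items[i];
  }
}

// Host helper: returns true if the round trip restored the original order.
bool RunRoundTrip() {
  int* d_out = nullptr;
  if (cudaMalloc(&d_out, kNumItems * sizeof(int)) != cudaSuccess) {
    return false;
  }

  RoundTripKernel<<<1, kWarpThreads>>>(d_out);

  std::vector<int> h_out(kNumItems);
  cudaError_t err = cudaMemcpy(h_out.data(), d_out, kNumItems * sizeof(int),
                               cudaMemcpyDeviceToHost);
  cudaFree(d_out);
  if (err != cudaSuccess) {
    return false;
  }

  for (int k = 0; k < kNumItems; ++k) {
    if (h_out[k] != k) {
      return false;
    }
  }
  return true;
}
```

With a block-wide `__syncthreads()` in place of `__syncwarp()`, the kernel would still be valid here because every thread of the block reaches the barrier, but in kernels where warps diverge or exit early a block-wide barrier may be stricter than a warp-wide collective needs.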