NVIDIA / cub

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

Segmented sorting does not preserve data in-between segments.

isovic opened this issue · comments

Hi,
CUB is an amazing asset and I find it very useful.

There is one thing I just discovered regarding segmented sorting, specifically when testing cub::DeviceSegmentedRadixSort::SortPairs.

Since this algorithm does not work in-place (which is of course a hard thing to do on the GPU), it writes its output to a separate array.
But as it turns out, if the input segments are not contiguous back-to-back (for example, if the data was padded and the padding is not to be sorted, or if the data in the segments was filtered and the segments became shorter), then CUB doesn't copy the data located in between segments into the output array.
This means that data gets lost going from input to output. As far as I can tell, the sorting routines never write the regions in between segments, so those locations in the output array keep whatever was previously in memory.
This isn't clear from the documentation.

I see two potential solutions:

  1. Have CUB copy the regions in between segments into the output array. Or,
  2. Document it clearly in the docs, so that the users know that they need to copy the data themselves.

Of course, I'd clearly prefer option number 1, but both are fine 🙂

Thank you,
Best regards,
Ivan.

Hi @isovic

I am a bit unsure whether I understand your question correctly.

Do you have data that you cast to a subclass that does not include the padding bytes, or do you have a single type you pass to radix sort and it does not properly copy it?

Hi @miscco,

Let me see if I can create a small working example and share it.

Ivan

The reason I am asking is that padding bytes are very deep in nasal demon territory, and generally there are no guarantees we can give there.

Hello @isovic!

Our segmented facilities guarantee that data outside of the described segments is not modified. So the second suggested solution is already in place.
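To make the guarantee concrete, here is a minimal host-side C++ sketch of the semantics described above. It is a CPU model and not CUB code: `segmented_sort_model` and `sort_preserving_gaps` are hypothetical names invented for this illustration, and `std::sort` stands in for the device radix sort. The assumption is simply that the sort writes only inside `[begin_offsets[s], end_offsets[s])` for each segment, so whatever the output array held in the gaps stays there. If the gap data is needed in the output, the caller seeds the output with a copy of the input before sorting.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical CPU stand-in for an out-of-place segmented key sort.
// It writes only to out[begin_offsets[s], end_offsets[s]) for each segment s,
// which models the guarantee discussed above. Elements of `out` that lie
// between segments are never touched.
void segmented_sort_model(const std::vector<int>& in,
                          std::vector<int>& out,
                          const std::vector<std::size_t>& begin_offsets,
                          const std::vector<std::size_t>& end_offsets) {
    for (std::size_t s = 0; s < begin_offsets.size(); ++s) {
        auto first = out.begin() + begin_offsets[s];
        auto last = out.begin() + end_offsets[s];
        // Copy the segment from the input, then sort it in the output.
        std::copy(in.begin() + begin_offsets[s], in.begin() + end_offsets[s], first);
        std::sort(first, last);
    }
}

// Caller-side workaround: seed the output with the whole input first, so the
// regions between segments carry over unchanged. On the device, this seeding
// step would be a device-to-device copy issued before the sort.
std::vector<int> sort_preserving_gaps(const std::vector<int>& in,
                                      const std::vector<std::size_t>& begin_offsets,
                                      const std::vector<std::size_t>& end_offsets) {
    std::vector<int> out(in);
    segmented_sort_model(in, out, begin_offsets, end_offsets);
    return out;
}
```

With an input of `{3, 1, 2, -1, -1, 9, 7, 8}` and segments `[0, 3)` and `[5, 8)`, an output array pre-filled with zeros keeps its zeros at positions 3 and 4, while the seeded version keeps the original `-1` padding there.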

This is great, thanks!
That note is not easy to spot if one doesn't know to look there.
Could you add it somewhere on the official docs page?
https://nvlabs.github.io/cub/structcub_1_1_device_segmented_radix_sort.html#ab6623fbbaab619bd2401ff594c860bed

(I see it's Doxygen, but I don't see this sentence appearing on the above link.)

@miscco In that case, I won't be making the example.

Best regards,
Ivan.

@isovic thank you for the comment! We are currently updating our documentation, so the rendered version is outdated.