NVIDIA / cuCollections

[ENHANCEMENT]: Add XXHash

sleeepyjack opened this issue

Is your feature request related to a problem? Please describe.

The current default hasher, murmur3_32, is computationally expensive and might lead to suboptimal performance in libcudf's hash join when the build table is small enough to fit in the L1 or L2 cache (see rapidsai/cudf#10587).

Describe the solution you'd like

Libcudf has an open feature request (see rapidsai/cudf#12829) that proposes using a faster hash function, namely XXHash. I propose adding XXHash as a second hasher to cuco and then comparing its performance against murmur3_32.
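
For reference, here is a minimal host-side sketch of the 32-bit XXHash variant (XXH32), written from the public algorithm description. It is only meant to show how little work is done per 4-byte word. The namespace `sketch` and the function name `xxhash_32` are hypothetical and are not cuco's API. The sketch assumes a little-endian target and reads input through `memcpy`. A device version would presumably add `__host__ __device__` qualifiers, which are left out here.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

namespace sketch {  // hypothetical namespace, not part of cuco

// The five 32-bit prime constants used by XXH32.
constexpr std::uint32_t prime1 = 0x9E3779B1u;
constexpr std::uint32_t prime2 = 0x85EBCA77u;
constexpr std::uint32_t prime3 = 0xC2B2AE3Du;
constexpr std::uint32_t prime4 = 0x27D4EB2Fu;
constexpr std::uint32_t prime5 = 0x165667B1u;

// Rotate left; r is always in [1, 31] in this file.
inline std::uint32_t rotl(std::uint32_t x, int r) { return (x << r) | (x >> (32 - r)); }

// Unaligned 32-bit load. Assumption: little-endian byte order.
inline std::uint32_t read32(const unsigned char* p)
{
  std::uint32_t v;
  std::memcpy(&v, p, sizeof(v));
  return v;
}

// One step of the 16-byte stripe loop for a single accumulator lane.
inline std::uint32_t round32(std::uint32_t acc, std::uint32_t input)
{
  return rotl(acc + input * prime2, 13) * prime1;
}

inline std::uint32_t xxhash_32(const void* data, std::size_t len, std::uint32_t seed = 0)
{
  const auto* p = static_cast<const unsigned char*>(data);
  std::size_t i = 0;  // number of bytes consumed so far
  std::uint32_t h;

  if (len >= 16) {
    // Long inputs: four independent lanes, 16 bytes per iteration.
    std::uint32_t v1 = seed + prime1 + prime2;
    std::uint32_t v2 = seed + prime2;
    std::uint32_t v3 = seed;
    std::uint32_t v4 = seed - prime1;
    for (; i + 16 <= len; i += 16) {
      v1 = round32(v1, read32(p + i));
      v2 = round32(v2, read32(p + i + 4));
      v3 = round32(v3, read32(p + i + 8));
      v4 = round32(v4, read32(p + i + 12));
    }
    h = rotl(v1, 1) + rotl(v2, 7) + rotl(v3, 12) + rotl(v4, 18);
  } else {
    // Short inputs (e.g. a single 4- or 8-byte join key) skip the lanes.
    h = seed + prime5;
  }

  h += static_cast<std::uint32_t>(len);

  // Remaining whole 4-byte words.
  for (; i + 4 <= len; i += 4) {
    h = rotl(h + read32(p + i) * prime3, 17) * prime4;
  }
  // Remaining tail bytes.
  for (; i < len; ++i) {
    h = rotl(h + static_cast<std::uint32_t>(p[i]) * prime5, 11) * prime1;
  }

  // Final avalanche so that every input bit can affect every output bit.
  h ^= h >> 15;
  h *= prime2;
  h ^= h >> 13;
  h *= prime3;
  h ^= h >> 16;
  return h;
}

}  // namespace sketch
```

For a 4-byte key this amounts to one multiply-rotate-multiply step plus the avalanche. A call would look like `sketch::xxhash_32(&key, sizeof(key))`. Whether this actually beats murmur3_32 in the hash join is exactly what the proposed comparison would have to show.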

Describe alternatives you've considered

No response

Additional context

No response

Awesome @sleeepyjack! Feel free to tag me for review if you'd like. I'd be very interested in seeing what libcudf can improve here.