[ENHANCEMENT]: Perf guide
sleeepyjack opened this issue
Is your feature request related to a problem? Please describe.
cuCollections exposes a set of knobs that allow optimizing a hashing data structure for a specific use case.
For example:
- which probing scheme should I use?
- what's the best CG size?
- how does the input data type affect performance?
- can I use particular operations concurrently? How does that impact performance?
The interaction between those choices is also non-trivial.
Finding out which combination works best for an application is a time-consuming task.
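As a rough illustration of why the probing scheme question is worth measuring rather than guessing, here is a minimal host-side sketch. It is not cuCollections code and uses none of its API: `Probing`, `mix`, and `average_probes` are hypothetical names made up for this example. It only counts how many slots an insert inspects in a toy open-addressing table, and it assumes a prime `capacity` and `0 < num_keys < capacity`. It does not model CG size, GPU memory access patterns, or concurrent operations, all of which a real perf guide would have to cover.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: the two schemes compared here are textbook linear probing
// and double hashing, not the cuCollections implementations.
enum class Probing { Linear, DoubleHashing };

// 64-bit bit mixer (the MurmurHash3 finalizer), used as a stand-in hash.
inline std::uint64_t mix(std::uint64_t x) {
  x ^= x >> 33;
  x *= 0xff51afd7ed558ccdULL;
  x ^= x >> 33;
  x *= 0xc4ceb9fe1a85ec53ULL;
  x ^= x >> 33;
  return x;
}

// Inserts the distinct keys 0 .. num_keys-1 into an empty table and returns
// the average number of slots inspected per insert.
// Assumptions: `capacity` is prime (so every double-hashing step size visits
// all slots) and 0 < num_keys < capacity (so a free slot always exists).
double average_probes(Probing scheme, std::size_t capacity, std::size_t num_keys) {
  std::vector<bool> occupied(capacity, false);
  std::uint64_t total_probes = 0;

  for (std::size_t key = 0; key < num_keys; ++key) {
    std::size_t slot = mix(key) % capacity;

    // Linear probing always steps by 1; double hashing derives a per-key
    // step in [1, capacity - 1] from a second hash of the key.
    std::size_t step = 1;
    if (scheme == Probing::DoubleHashing) {
      step = 1 + mix(key ^ 0x9e3779b97f4a7c15ULL) % (capacity - 1);
    }

    std::uint64_t probes = 1;  // the initial slot counts as one probe
    while (occupied[slot]) {
      slot = (slot + step) % capacity;
      ++probes;
    }
    occupied[slot] = true;
    total_probes += probes;
  }
  return static_cast<double>(total_probes) / static_cast<double>(num_keys);
}
```

Calling `average_probes(Probing::Linear, 10007, 9000)` and `average_probes(Probing::DoubleHashing, 10007, 9000)` compares the two schemes at roughly 90% load. Probe count is only one input to actual throughput, so numbers like these would complement, not replace, GPU benchmarks in a perf guide.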
Describe the solution you'd like
Write a perf guide. Could be as simple as a Markdown file.
Describe alternatives you've considered
No response
Additional context
No response
We do provide performance guidance in the probing sequence doc, e.g.:
cuCollections/include/cuco/probe_sequences.cuh, lines 26 to 28 in 4bdf606
cuCollections/include/cuco/probe_sequences.cuh, lines 52 to 55 in 4bdf606
Having a performance tuning section in the README doesn't seem right.
Right. This would be too much information for a README. I would put it in a separate file and link to it from the README.