TODO (**Optimize, potentially using new VQSort PartialSort**)
enum-class opened this issue · comments
I have a question about this TODO (Optimize, potentially using new VQSort PartialSort) in here:
I want to do it but I'm struggling to find a clean solution. Can you help me out?
Initially, it seems `VQSelect` is just enough since `create_distribution` doesn't need sorted probabilities.
One idea is to create an array of key-value pairs (something like `K32V32`) from the probabilities and their indexes, then apply `VQSelect` and pass the first `k` elements to `create_distribution`. But this involves allocating and copying a potentially large probabilities array and requires a special structure for comparison, something like `OrderDescendingKV64`.
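To make the first idea concrete, here is a rough sketch of the shape I have in mind. It is not written against the VQSort API: `std::nth_element` stands in for `VQSelect`, a plain `std::pair<float, uint32_t>` stands in for a `K32V32`-style key-value type, and `TopKUnsorted` is a hypothetical helper name. It only illustrates the pair-building copy and the fact that the first `k` elements come back unsorted.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the existing code): returns the k most
// probable (probability, index) pairs, in unspecified order.
// std::nth_element is used here as a stand-in for VQSelect.
std::vector<std::pair<float, uint32_t>> TopKUnsorted(
    const std::vector<float>& probs, size_t k) {
  // This is the extra allocation and copy mentioned above: one pair per
  // probability, so the original index survives the partition.
  std::vector<std::pair<float, uint32_t>> kv;
  kv.reserve(probs.size());
  for (size_t i = 0; i < probs.size(); ++i) {
    kv.emplace_back(probs[i], static_cast<uint32_t>(i));
  }

  k = std::min(k, kv.size());
  if (k < kv.size()) {
    // Descending order: afterwards the first k pairs are the k largest,
    // but they are not sorted among themselves.
    std::nth_element(kv.begin(), kv.begin() + k, kv.end(), std::greater<>());
  }
  kv.resize(k);
  return kv;
}
```

The returned `k` pairs would then be passed to `create_distribution`, which is acceptable only if it really does not depend on the order of its input.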
Another idea is to create a special version of VQSelect just for this case.
Or simply leave the code as it is. What do you think?
Thanks for considering this! I think it's fairly low on the profile, so let's focus on other things first, in particular the prefill batching and matmul. I'm working on a plan for those and will post an issue soon with a proposed roadmap :)