`fn Rav1dRefmvsDSPContext::splat_mv` is slow
fbossen opened this issue
When processing the "chimera 8-bit" bitstream, the decoder spends about 3% of its time in `fn Rav1dRefmvsDSPContext::splat_mv`. There is no equivalent C function. Also, this is much more time than what is spent in the `dav1d_splat_mv_*` assembly functions that do the actual data processing.
Let me look into this. We should hopefully be able to make it faster but a bit less safe.
Okay, I think I figured out how to eliminate most of the overhead (basically, only compute the `r` ptr for the elements we're using). I'll open a PR soon.
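To illustrate the idea for anyone following along, here is a rough sketch, not the actual rav1d code. `Block`, `splat_eager`, and `splat_lazy` are hypothetical names, and the sketch uses safe slices rather than the raw pointers the real wrapper hands to the assembly. Each function returns how many row handles it prepared, so the difference in overhead is visible.

```rust
/// Hypothetical stand-in for a motion-vector block; not the real rav1d type.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct Block(u32);

/// Eager variant: prepares a handle for every row up front, even though
/// only rows `by4..by4 + bh4` are written. Returns the number prepared.
fn splat_eager(
    rows: &mut [Vec<Block>],
    value: Block,
    by4: usize,
    bx4: usize,
    bw4: usize,
    bh4: usize,
) -> usize {
    // One handle per row in the whole buffer (the overhead being discussed).
    let mut handles: Vec<&mut [Block]> =
        rows.iter_mut().map(|r| r.as_mut_slice()).collect();
    let prepared = handles.len();
    // Only this sub-range of handles is actually used.
    for row in handles[by4..by4 + bh4].iter_mut() {
        row[bx4..bx4 + bw4].fill(value);
    }
    prepared
}

/// Lazy variant: only touches the rows that will actually be written.
/// Returns the number of row handles prepared.
fn splat_lazy(
    rows: &mut [Vec<Block>],
    value: Block,
    by4: usize,
    bx4: usize,
    bw4: usize,
    bh4: usize,
) -> usize {
    let mut prepared = 0;
    for row in rows[by4..by4 + bh4].iter_mut() {
        prepared += 1;
        row[bx4..bx4 + bw4].fill(value);
    }
    prepared
}
```

Both variants write the same cells. With a 16-row buffer and `bh4 = 2`, the eager one prepares 16 handles and the lazy one prepares 2, so the work scales with the block height instead of the buffer height.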