komihash 5.0
avaneev opened this issue
Hi!
Version 5.0.
Simplified handling of the "final byte", yielding a 0.5 cycles/hash improvement.
Rearranged Loop64 memory addresses, yielding a 0.5 GB/s improvement in large-block hashing.
Note that the output values of the function changed.
branch komihash5
updated to v5.1, very minor change
Please also update the code size, it may be lower now.
Bulk performance is still 12.3GB/s, for no technically understandable reason, and laughably close to the new polymur-hash. Should really be about 19 GB/s...
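For context, a bulk figure like the one quoted above is simply bytes hashed divided by elapsed time. The sketch below is a rough illustration only, not SMHasher's actual benchmark: `throughput_gb_per_s` and `shortfall` are hypothetical helper names, `zlib.crc32` is just a stand-in for the hash under test, and 1 GB is assumed to mean 10^9 bytes.

```python
import time
import zlib


def throughput_gb_per_s(hash_fn, block, repeats=20):
    """Best observed throughput in GB/s, assuming 1 GB = 1e9 bytes."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hash_fn(block)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Guard against a zero reading on timers with coarse resolution.
    best = max(best, 1e-12)
    return len(block) / best / 1e9


def shortfall(measured, expected):
    """Fraction by which `measured` falls short of `expected`."""
    return 1.0 - measured / expected


if __name__ == "__main__":
    block = bytes(256 * 1024)  # 256 KiB of zeros as a bulk input
    rate = throughput_gb_per_s(zlib.crc32, block)  # stand-in hash, not komihash
    print(f"stand-in crc32: {rate:.2f} GB/s")
    # The two figures mentioned in this thread: 12.3 GB/s measured, 19 GB/s expected.
    print(f"gap to expectation: {shortfall(12.3, 19.0):.1%}")
```

With the two numbers from this thread, 12.3 GB/s works out to roughly 35% below the expected 19 GB/s.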
> Bulk performance is still 12.3GB/s, for no technically understandable reason
Perhaps there is not enough research into this topic. I myself do not know of any reliable way to predict the performance (non-)contributions (e.g. in percent) of particular "tuples" of operations/instructions, patterns, and techniques.
Perhaps I should try some very fine-grained complexity metrics to see if there is any correlation.
actually the size is larger now
objdump -dC build/SMHasher |less
25830 - 25d5b: 1323
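The 1323 figure is the distance between the two hexadecimal addresses read off the disassembly. A minimal sketch of that arithmetic follows; `span_bytes` is a hypothetical helper name, and treating the second address as the end of the span (rather than adding the length of the last instruction) is an assumption that matches the number given here.

```python
def span_bytes(start_hex, end_hex):
    """Distance in bytes between two hexadecimal addresses."""
    return int(end_hex, 16) - int(start_hex, 16)


if __name__ == "__main__":
    # Addresses as they appear in the objdump listing above.
    print(span_bytes("25830", "25d5b"))  # prints 1323
```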
done
Thanks!
Here are komihash test results on a large variety of platforms: https://bench.cr.yp.to/results-hash.html
Reini's compiler is likely misconfigured.
I've found out that it's GCC that generates much slower 64-byte hashing code on the Zen platform. With Clang, or with GCC on Intel platforms, there are no issues. This does not affect small-string timings, though.
Strangely enough, the komihash_stream_oneshot() function does perform as expected with GCC on Zen. There's some issue with the compiler on this code.