ogxd / gxhash

The fastest hashing algorithm 📈

Home Page: https://docs.rs/gxhash

Use aligned loads in get_partial_safe

bill-myers opened this issue

In get_partial_safe, it's possible to use aligned loads by declaring the buffer as MaybeUninit<State>, casting its pointer for std::ptr::copy, and zeroing the rest of the buffer, instead of declaring the buffer as a u8 array.

let mut buffer = [0i8; VECTOR_SIZE];
// Copy data into the buffer
std::ptr::copy(data as *const i8, buffer.as_mut_ptr(), len);
// Load the buffer into a __m128i vector (unaligned load)
let partial_vector = _mm_loadu_epi8(buffer.as_ptr());
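For comparison with the snippet above, here is a minimal sketch of what the MaybeUninit proposal could look like. It is not the crate's code: `State` is a hypothetical 16-byte-aligned stand-in for the real SIMD type (for example `__m128i`), and `get_partial_aligned` is an illustrative name, so the sketch builds on any target.

```rust
use std::mem::MaybeUninit;

const VECTOR_SIZE: usize = 16;

/// Hypothetical stand-in for the crate's SIMD `State` type (e.g. `__m128i`).
/// The 16-byte alignment is what would make an aligned load legal.
#[derive(Clone, Copy)]
#[repr(C, align(16))]
struct State([u8; VECTOR_SIZE]);

/// Illustrative helper (assumed name): copies up to VECTOR_SIZE bytes into a
/// zero-padded buffer that has the alignment of `State`.
fn get_partial_aligned(data: &[u8]) -> State {
    assert!(data.len() <= VECTOR_SIZE);
    let len = data.len();
    // Uninitialized storage with the size and alignment of `State`.
    let mut buffer = MaybeUninit::<State>::uninit();
    let dst = buffer.as_mut_ptr() as *mut u8;
    unsafe {
        // Copy the input bytes to the start of the buffer.
        std::ptr::copy_nonoverlapping(data.as_ptr(), dst, len);
        // Zero the remaining bytes so every byte is initialized.
        std::ptr::write_bytes(dst.add(len), 0, VECTOR_SIZE - len);
        // All VECTOR_SIZE bytes are now written, so this is sound.
        buffer.assume_init()
    }
}
```

With a real SIMD type in place of the stand-in, the buffer itself would already be the vector, so no separate unaligned load would be needed. Whether that is measurably faster than the array version is exactly what this issue leaves open.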

From what I can see there is a compiler optimization that stack allocates [0i8; VECTOR_SIZE] instead of heap allocating (probably because VECTOR_SIZE is a constant), so MaybeUninit<State> may not be faster.

About to close this one unless someone has some snippet to propose?

I tried using a struct which contains only the byte array and is marked as #[repr(align(16))]. I have not tested the performance yet, but this should still allocate on the stack and force 16-byte alignment.
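A rough sketch of the wrapper-struct idea described in this comment, under the assumption that VECTOR_SIZE is 16. The names `AlignedBuffer` and `copy_into_aligned` are hypothetical and do not come from the crate.

```rust
const VECTOR_SIZE: usize = 16;

/// Hypothetical wrapper: a plain byte array forced to 16-byte alignment.
#[repr(align(16))]
struct AlignedBuffer([u8; VECTOR_SIZE]);

/// Illustrative helper (assumed name): zero-initializes the aligned buffer
/// and copies the input bytes to its start, using only safe code.
fn copy_into_aligned(data: &[u8]) -> AlignedBuffer {
    assert!(data.len() <= VECTOR_SIZE);
    // The array is zeroed up front, so the tail needs no extra work.
    let mut buffer = AlignedBuffer([0u8; VECTOR_SIZE]);
    buffer.0[..data.len()].copy_from_slice(data);
    buffer
}
```

Because the struct is 16-byte aligned, the address of its array is too, so on x86 an aligned load such as `_mm_load_si128` could in principle be used on `buffer.0.as_ptr()` cast to `*const __m128i`. The struct is still an ordinary local value, so it stays on the stack.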

Closing this, as the proposed solution neither provides significant performance gains nor simplifies the code. Feel free to open another issue if you have something to suggest.