Remove the memcopies of the vectors

Question

Kerollmops opened this issue 8 months ago

Copying the vectors takes up to 22% of the time, with another 5% spent zeroing (bzero) each vector in advance and 5% more dropping the allocated vectors. It seems the AVX implementation could be switched to work on non-aligned f32 slices, which should make these copies unnecessary.
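As a rough illustration of what working on non-aligned f32 slices could look like, here is a minimal sketch that uses the unaligned load intrinsic `_mm256_loadu_ps` from `std::arch`. The dot product and the names `dot`, `dot_avx_unaligned` and `dot_scalar` are hypothetical and chosen only for the example. They are not taken from this crate, whose actual distance functions may look different.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Scalar fallback, used when AVX is not available.
fn dot_scalar(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Hypothetical AVX dot product that reads straight from the input slices.
/// `_mm256_loadu_ps` has no alignment requirement, so the inputs do not
/// need to be copied into a 32-byte-aligned buffer first.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn dot_avx_unaligned(a: &[f32], b: &[f32]) -> f32 {
    let len = a.len().min(b.len());
    let chunks = len / 8;
    unsafe {
        let mut acc = _mm256_setzero_ps();
        for i in 0..chunks {
            // Unaligned loads of 8 consecutive f32 values from each slice.
            let va = _mm256_loadu_ps(a.as_ptr().add(i * 8));
            let vb = _mm256_loadu_ps(b.as_ptr().add(i * 8));
            acc = _mm256_add_ps(acc, _mm256_mul_ps(va, vb));
        }
        // Spill the 8 partial sums and reduce them on the scalar side.
        let mut lanes = [0.0f32; 8];
        _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
        let mut sum: f32 = lanes.iter().sum();
        // Handle the tail that does not fill a whole 8-lane register.
        for i in chunks * 8..len {
            sum += a[i] * b[i];
        }
        sum
    }
}

/// Picks the AVX path at runtime when the CPU supports it.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            return unsafe { dot_avx_unaligned(a, b) };
        }
    }
    dot_scalar(a, b)
}
```

With this shape, a caller can pass a sub-slice such as `&v[1..]` directly, without allocating, zeroing and later dropping an aligned copy. Whether unaligned loads cost anything measurable compared to aligned ones depends on the CPU, so the change would need to be benchmarked.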