Explore the idea of peeking SIMD data
ebassi opened this issue · comments
It might be possible to return a pointer to the start of a graphene_simd4_t (or graphene_simd4x4f_t) to functions that expect an array of floating point values; this would allow removing a stack allocation when all we care about is passing a bunch of floats to, say, GL.
Experiments on x86_64 seem to yield positive results, but it could be a combination of recent compilers and specific SIMD types, so this would require further investigation:
- does passing the reference to an __m128 or a float32x4 type actually lead to a SIMD register read?
- if the read happens, is it dependent on the OS?
- if the read happens, is it dependent on the type or version of the compiler?
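
To make the two code paths concrete, here is a minimal sketch of "copy to a stack array" versus "peek at the SIMD value in place". It is not Graphene code: `simd4f_t`, `simd4f_init`, `simd4f_store`, `simd4f_peek` and `sum4` are hypothetical names used only for illustration, and `sum4` stands in for any consumer that takes a plain array of floats (such as a GL call). The peeking path assumes the compiler lets a vector value be read through a `float` pointer, which is exactly the behaviour this issue proposes to investigate; on targets without SSE the sketch falls back to a plain struct.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__)
# include <xmmintrin.h>
/* Hypothetical stand-in for a four-float SIMD type backed by __m128 */
typedef __m128 simd4f_t;
# define simd4f_init(x, y, z, w)  _mm_setr_ps ((x), (y), (z), (w))
# define simd4f_store(v, out)     _mm_storeu_ps ((out), (v))
#else
/* Scalar fallback so the sketch still builds without SSE */
typedef struct { float f[4]; } simd4f_t;

static inline simd4f_t
simd4f_init (float x, float y, float z, float w)
{
  simd4f_t v = { { x, y, z, w } };
  return v;
}

static inline void
simd4f_store (simd4f_t v, float *out)
{
  for (size_t i = 0; i < 4; i++)
    out[i] = v.f[i];
}
#endif

/* Hypothetical consumer that expects a plain array of four floats */
static float
sum4 (const float *v)
{
  assert (v != NULL);
  return v[0] + v[1] + v[2] + v[3];
}

/* Copying path: store the vector into a temporary stack array first */
static float
sum_via_copy (simd4f_t s)
{
  float tmp[4];

  simd4f_store (s, tmp);
  return sum4 (tmp);
}

/* Peeking path: reinterpret the address of the vector as a float pointer.
 * ASSUMPTION: relies on compiler behaviour, not on a guarantee of the
 * C standard. */
static const float *
simd4f_peek (const simd4f_t *s)
{
  return (const float *) s;
}

static float
sum_via_peek (const simd4f_t *s)
{
  return sum4 (simd4f_peek (s));
}
```

Both paths should produce the same values; whether the peeking path avoids the extra store, or forces the value out of a register into memory anyway, can only be judged by inspecting the generated assembly (for instance with `gcc -O2 -S`) for each compiler and target of interest.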