Haswell config segfaults on macOS with gcc and CFLAGS="-g -O0"
devinamatthews opened this issue · comments
The perfect storm happens:
- On macOS
- With gcc (in my case 11.2)
- With debugging enabled (
CFLAGS="-g -O0"
) - Using the
haswell
configuration
What occurs in my case is that bli_saxpyv_zen_int10
is compiled with flags for AVX, and GCC makes the assumption that the incoming stack pointer is 32-byte aligned. Then, because of the debug flags, horrible code is emitted which spills vector registers (__m256
) to the stack. This spilling uses aligned move instructions which require 32-byte alignment. GCC dutifully 32-byte aligns all offsets from the stack pointer to satisfy this constraint. Here is the catch: if the incoming stack point is NOT actually 32-byte aligned (as it isn't in BLIS, since the calling code is not compiled for AVX but for general x86_64), then we encounter a segfault.
It has proven quite difficult to actually force GCC to align the stack using e.g. andq $-32, %rsp
. The only thing that seems to work is -mincoming-stack-boundary=3
(not 4!). I will add this to the default haswell
flags since so far I can only confirm the problem on a mac. I'll see about Linux shortly.
Linux seems unaffected. Windows is also unaffected according to the internet.
I'll leave this open as a warning to others.