flame / blis

BLAS-like Library Instantiation Software Framework

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Haswell config segfaults on macOS with gcc and CFLAGS="-g -O0"

devinamatthews opened this issue · comments

The perfect storm happens:

  1. On macOS
  2. With gcc (in my case 11.2)
  3. With debugging enabled (CFLAGS="-g -O0")
  4. Using the haswell configuration

What occurs in my case is that bli_saxpyv_zen_int10 is compiled with flags for AVX, and GCC makes the assumption that the incoming stack pointer is 32-byte aligned. Then, because of the debug flags, horrible code is emitted which spills vector registers (__m256) to the stack. This spilling uses aligned move instructions which require 32-byte alignment. GCC dutifully 32-byte aligns all offsets from the stack pointer to satisfy this constraint. Here is the catch: if the incoming stack point is NOT actually 32-byte aligned (as it isn't in BLIS, since the calling code is not compiled for AVX but for general x86_64), then we encounter a segfault.

It has proven quite difficult to actually force GCC to align the stack using e.g. andq $-32, %rsp. The only thing that seems to work is -mincoming-stack-boundary=3 (not 4!). I will add this to the default haswell flags since so far I can only confirm the problem on a mac. I'll see about Linux shortly.

Linux seems unaffected. Windows is also unaffected according to the internet.

I'll leave this open as a warning to others.