Parallel hashes crash when built with AVX2 on Windows.
damaki opened this issue
Description:
When libkeccak is built on Windows with AVX2 instructions enabled (ARCH=x86_64 and SIMD=AVX2), parallel hashes (KangarooTwelve, ParallelHash, etc.) crash with a Program_Error raised with EXCEPTION_ACCESS_VIOLATION as the message when processing input data or generating output data. This only occurs if the input/output data buffer is large enough to trigger the use of AVX2 instructions.
The problem has only been observed on Windows. Builds on Linux using the same version of the compiler (GNAT Community 2019) are confirmed to be working at the time of writing.
Steps to reproduce:
Compiler version: 64-bit GCC 8.3.1 20190518 (for GNAT Community 2019 20190517)
Operating system: Windows
- On Windows, run make test ARCH=x86_64 SIMD=AVX2
- The crash occurs when the tests are run.
Workaround:
The workaround is to avoid building libkeccak with AVX2 on Windows. Instead, use SSE2 instructions only, i.e. build libkeccak with ARCH=x86_64 SIMD=SSE2. This results in slightly lower performance than AVX2, but it is still pretty fast and at least it doesn't crash.
Root cause:
The root of the problem is that GCC does not respect the requested 32-byte alignment for objects of type Keccak.Arch.AVX2.V4DI_Vectors.V4DI allocated on the stack, yet it still generates AVX2 instructions (e.g. vmovdqa) that assume 32-byte alignment. The resulting load/store of misaligned data on the stack causes the segfault in the AVX2 instantiations of Keccak.Generic_Parallel_Keccakf.Permute_All.
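To make the alignment requirement concrete, here is a minimal C sketch (not libkeccak code, which is written in Ada) of the condition an address must satisfy for a 256-bit vmovdqa access. The helper is_aligned and the object buffer are hypothetical names introduced for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if p is a multiple of `alignment`,
 * which must be a non-zero power of two. */
static bool is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Hypothetical test object with static storage, so its 32-byte
 * alignment does not depend on how the stack is aligned. */
static _Alignas(32) unsigned char buffer[64];
```

With these definitions, is_aligned(buffer, 32) holds, while buffer + 16 is 16-byte aligned but not 32-byte aligned. An address of the second kind is the case that faults when used as the memory operand of a 256-bit vmovdqa.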
By contrast, on Linux GCC adjusts the stack pointer in the function prologue so that it is 32-byte aligned, as the following disassembly shows:
0x000000000040ecf0 <+0>: push %rbp
0x000000000040ecf1 <+1>: mov $0x432540,%eax
0x000000000040ecf6 <+6>: mov $0x4326c0,%edx
0x000000000040ecfb <+11>: mov %rsp,%rbp
0x000000000040ecfe <+14>: and $0xffffffffffffffe0,%rsp
0x000000000040ed02 <+18>: sub $0x368,%rsp
The same function built on Windows with the same version of GNAT does not align the stack pointer, as its disassembly shows:
0x0000000000452e50 <+0>: sub $0x468,%rsp
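The difference between the two prologues can be sketched as plain address arithmetic in C. This is a simplified model, not generated code: it assumes the instructions shown above are the only ones that change the stack pointer, and that rsp is 8 modulo 16 on function entry (as both x86-64 calling conventions leave it after the call instruction). The function names, the entry values used below, and the slot offsets 8 and 0 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the Linux prologue: push %rbp; and $-32,%rsp; sub $0x368,%rsp. */
static uint64_t linux_rsp_after_prologue(uint64_t entry_rsp)
{
    uint64_t rsp = entry_rsp - 8;            /* push %rbp */
    rsp &= UINT64_C(0xffffffffffffffe0);     /* round down to a multiple of 32 */
    return rsp - 0x368;                      /* reserve the frame */
}

/* Model of the Windows prologue: sub $0x468,%rsp with no realignment. */
static uint64_t windows_rsp_after_prologue(uint64_t entry_rsp)
{
    return entry_rsp - 0x468;
}

/* Offset of an address within its 32-byte block (0 means 32-byte aligned). */
static unsigned misalignment_32(uint64_t addr)
{
    return (unsigned)(addr & 31);
}
```

In this model the Linux frame always ends up at the same position modulo 32, so a slot at a fixed offset (here rsp + 8) is 32-byte aligned regardless of the caller. On Windows the result depends on the incoming stack pointer: for the two entry values 0x7fffffffe008 and 0x7fffffffe018, a slot at rsp + 0 is aligned in the first case and off by 16 in the second, which is consistent with the crash described above.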
This seems to be a known bug in 64-bit GCC on Windows, judging by the following links:
GCC bug 54412 is also relevant: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412
Closing this since this is a GCC bug and is outside the scope of this library. The top-level README.md was updated in #18 to add a warning that AVX2 is not guaranteed to work on Windows with a reference to the GCC bug.