damaki / libkeccak

SHA-3 and other Keccak related algorithms in SPARK/Ada.

Parallel hashes crash when built with AVX2 on Windows.

damaki opened this issue

Description:
When libkeccak is built on Windows with AVX2 instructions enabled (ARCH=x86_64 and SIMD=AVX2), the parallel hashes (KangarooTwelve, ParallelHash, etc.) crash while processing input data or generating output data. The crash is a Program_Error raised with EXCEPTION_ACCESS_VIOLATION as the message. It only occurs if the input/output data buffer is large enough to trigger the use of AVX2 instructions.

The problem has only been observed on Windows. Builds on Linux using the same version of the compiler (GNAT Community 2019) are confirmed to be working at the time of writing.

Steps to reproduce:
Compiler version: 64-bit GCC 8.3.1 20190518 (for GNAT Community 2019 20190517)
Operating system: Windows

  1. On Windows, run make test ARCH=x86_64 SIMD=AVX2
  2. The crash occurs when the tests are run.

Workaround:
The workaround is to avoid building libkeccak with AVX2 on Windows. Instead, use SSE2 instructions only, i.e. build libkeccak with ARCH=x86_64 SIMD=SSE2. This will result in slightly lower performance compared to AVX2, but is still pretty fast and at least it doesn't crash.

Root cause:
The root of the problem is that GCC does not respect the requested 32-byte alignment for objects of type Keccak.Arch.AVX2.V4DI_Vectors.V4DI allocated on the stack, but still generates AVX2 instructions (e.g. vmovdqa) which assume 32-byte alignment. The resulting attempt to load/store misaligned data on the stack causes the access violation in the AVX2 instantiations of Keccak.Generic_Parallel_Keccakf.Permute_All.
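
The alignment requirement can be expressed as a simple address check. The following C sketch is illustrative only and is not part of libkeccak; the helper names is_aligned_16 and is_aligned_32 are hypothetical. It shows that an address can be 16-byte aligned without being 32-byte aligned, and 32-byte alignment is what an aligned 256-bit move such as vmovdqa on a ymm register expects.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of libkeccak): true if addr is a
   multiple of 16 bytes. */
bool is_aligned_16(uintptr_t addr)
{
    return (addr & 0x0F) == 0;
}

/* Hypothetical helper (not part of libkeccak): true if addr is a
   multiple of 32 bytes, which an aligned 256-bit AVX move requires. */
bool is_aligned_32(uintptr_t addr)
{
    return (addr & 0x1F) == 0;
}
```

For example, the address 0x1010 passes is_aligned_16 but fails is_aligned_32, so an aligned 256-bit load from it would fault.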

By contrast, on Linux GCC aligns the stack pointer to a 32-byte boundary in the function prologue (the and instruction at offset +14), as the following disassembly shows:

   0x000000000040ecf0 <+0>:    push   %rbp
   0x000000000040ecf1 <+1>:    mov    $0x432540,%eax
   0x000000000040ecf6 <+6>:    mov    $0x4326c0,%edx
   0x000000000040ecfb <+11>:    mov    %rsp,%rbp
   0x000000000040ecfe <+14>:    and    $0xffffffffffffffe0,%rsp
   0x000000000040ed02 <+18>:    sub    $0x368,%rsp

The disassembly of the same function when built on Windows with the same version of GNAT does not align the stack pointer:

   0x0000000000452e50 <+0>:     sub    $0x468,%rsp
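
The difference between the two prologues can be checked with plain integer arithmetic. The C sketch below is a model, not code from libkeccak: the function names are hypothetical, and it assumes that the stack pointer on function entry is 8 modulo 16 (both x86-64 calling conventions require 16-byte alignment at the call site, and the call pushes an 8-byte return address). It computes the value of rsp after each prologue so that its residue modulo 32 can be inspected.

```c
#include <stdint.h>

/* Models the Linux prologue shown above:
   push %rbp ; and $0xffffffffffffffe0,%rsp ; sub $0x368,%rsp */
uint64_t linux_prologue_rsp(uint64_t entry_rsp)
{
    uint64_t rsp = entry_rsp - 8;          /* push %rbp */
    rsp &= UINT64_C(0xffffffffffffffe0);   /* round down to a multiple of 32 */
    rsp -= 0x368;                          /* reserve the frame */
    return rsp;
}

/* Models the Windows prologue shown above: sub $0x468,%rsp only. */
uint64_t windows_prologue_rsp(uint64_t entry_rsp)
{
    return entry_rsp - 0x468;              /* no alignment step */
}
```

With the hypothetical entry values 0x7ffc00001008 and 0x7ffc00001018, the Linux model leaves rsp at the same residue modulo 32 in both cases, so a fixed offset from rsp can always reach a 32-byte aligned slot. The Windows model leaves rsp at 0 modulo 32 in one case and 16 modulo 32 in the other, so whether a fixed offset is 32-byte aligned depends on the caller, which is consistent with the crash described above.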

This seems to be a known bug in 64-bit GCC on Windows, judging by the following links:

Closing this since this is a GCC bug and is outside the scope of this library. The top-level README.md was updated in #18 to add a warning that AVX2 is not guaranteed to work on Windows with a reference to the GCC bug.