littlekernel / lk

LK embedded kernel

Exception occurs when SVE instructions used on aarch64

schultetwin1 opened this issue

SVE (Scalable Vector Extension) is an extension to the aarch64 architecture. It was introduced as an optional extension in ARMv8.2-A and became built in for ARMv9-A. LK currently does not disable the traps for SVE, so executing an SVE instruction raises an exception.

This is particularly a problem if an SVE instruction executes before VBAR_EL1 has been configured, as happens in LK's initialization code. The stack trace is:

0: memset
1: init_thread_struct
2: thread_init_early
3: lk_main

Clang/LLVM will insert an SVE instruction into LK's implementation of memset when optimizing code.

To work around this, we have added the following two compilation flags to clang.

  • -fno-vectorize
  • -fno-slp-vectorize

This workaround works great for us because we don't need to use any SIMD instructions (SVE is a type of SIMD). I'm opening this issue as an FYI in case others run into it.
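For anyone who would rather not turn off vectorization for the whole build, Clang also has a per-loop pragma. The sketch below is untested in LK and only illustrates the idea: fill_bytes is a hypothetical stand-in, not LK's actual memset, and the pragma only affects the annotated loop, so it is narrower than the two flags above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical byte-fill routine standing in for a memset-style loop. */
void *fill_bytes(void *dst, int c, size_t n) {
    unsigned char *p = dst;

    assert(dst != NULL || n == 0);

#if defined(__clang__)
/* Ask Clang's loop vectorizer to keep the following loop scalar. */
#pragma clang loop vectorize(disable) interleave(disable)
#endif
    for (size_t i = 0; i < n; i++) {
        p[i] = (unsigned char)c;
    }
    return dst;
}
```

Other compilers do not know this pragma, hence the __clang__ guard.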

Below is a list of what I believe needs to be done to get SVE instructions working in LK, in case anyone needs to do this. This has not been tested and there may be more steps.

  1. Disable the EL2 coprocessor traps for SVE in arm64_el3_to_el1

    lk/arch/arm64/asm.S

    Lines 56 to 58 in ca633e2

    /* disable EL2 coprocessor traps */
    mov x0, #0x33ff
    msr cptr_el2, x0

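    That code should (I believe) read:
    /* 0x333ff does not fit a single mov immediate, so build it in two steps */
    mov x0, #0x33ff
    movk x0, #0x3, lsl #16
    msr cptr_el2, x0

This is meant to set the ZEN bits in CPTR_EL2. The same change needs to be made in arm64_elX_to_el1.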

  2. Disable the SVE traps in EL1

    lk/arch/arm64/asm.S

    Lines 60 to 62 in ca633e2

    /* disable EL1 FPU traps */
    mov x0, #(0b11<<20)
    msr cpacr_el1, x0

    That code should (I believe) read:
    mov x0, #((0b11<<20) | (0b11 << 16))
    msr cpacr_el1, x0

This sets the ZEN bits in the CPACR_EL1 register. The same change also needs to be made in arm_reset.

  3. Properly configure the ZCR_ELx control registers

  4. Update arm64_fpu_exception to handle SVE exceptions as well, for the lazy loading of SVE registers (I'm not sure if they overlap with the FPU registers or not).

  5. Update arm64_fpu_pre_context_switch to properly lazy save SVE registers
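The register values from the first three steps above can be sketched in C to sanity-check the bit arithmetic. This is an untested sketch, not LK code: the helper names are hypothetical, and the field positions are my reading of the Arm architecture (CPACR_EL1.ZEN at bits [17:16], ZCR_ELx.LEN at bits [3:0], ID_AA64PFR0_EL1.SVE at bits [35:32]). One caveat worth checking: as far as I can tell, CPTR_EL2 only has a ZEN field when HCR_EL2.E2H is 1. In the E2H == 0 layout the SVE trap is the TZ bit (bit 8), which 0x33ff leaves set, so it may need to be cleared instead.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions per my reading of the Arm architecture reference manual;
 * double-check them before use. All helper names here are hypothetical. */
#define CPACR_EL1_ZEN_SHIFT   16u                /* SVE trap control, bits [17:16] */
#define CPACR_EL1_FPEN_SHIFT  20u                /* FP/SIMD trap control, bits [21:20] */
#define CPTR_EL2_TZ           (UINT64_C(1) << 8) /* E2H == 0 layout: 1 = trap SVE */
#define ID_AA64PFR0_SVE_SHIFT 32u                /* SVE field, bits [35:32] */

/* CPACR_EL1 value with both FP/SIMD and SVE untrapped at EL0 and EL1. */
static inline uint64_t cpacr_el1_fp_sve_enabled(void) {
    return (UINT64_C(3) << CPACR_EL1_FPEN_SHIFT) |
           (UINT64_C(3) << CPACR_EL1_ZEN_SHIFT);
}

/* Clear TZ in a CPTR_EL2 value (E2H == 0 layout) so EL2 stops trapping SVE. */
static inline uint64_t cptr_el2_untrap_sve(uint64_t cptr) {
    return cptr & ~CPTR_EL2_TZ;
}

/* ZCR_ELx.LEN encoding for a requested vector length in bits (128..2048).
 * The hardware may still limit the effective length to what it implements. */
static inline uint64_t zcr_len_for_bits(unsigned bits) {
    assert(bits >= 128 && bits <= 2048 && bits % 128 == 0);
    return (uint64_t)(bits / 128 - 1);
}

/* Nonzero if an ID_AA64PFR0_EL1 value reports SVE as implemented. */
static inline int cpu_has_sve(uint64_t id_aa64pfr0) {
    return ((id_aa64pfr0 >> ID_AA64PFR0_SVE_SHIFT) & 0xf) != 0;
}
```

For example, cpacr_el1_fp_sve_enabled() yields 0x330000, which matches the immediate in the proposed CPACR_EL1 change, and cptr_el2_untrap_sve(0x33ff) yields 0x32ff. Checking cpu_has_sve against the real ID_AA64PFR0_EL1 value before touching ZCR_ELx seems prudent, since SVE is optional on ARMv8 cores.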