google / XNNPACK

High-efficiency floating-point neural network inference operators for mobile, server, and Web


Not all aarch64 CPUs support ARMv8.2 instructions

rfried-nrl opened this issue

The build assumes that every aarch64 target is ARMv8.2, but the ARM Cortex-A53 and Cortex-A72 are ARMv8.0:
```cmake
IF(XNNPACK_TARGET_PROCESSOR MATCHES "^(aarch64|arm64)$" OR IOS_ARCH MATCHES "^arm64.*")
  SET_PROPERTY(SOURCE ${XNNPACK_AARCH64_NEONFP16ARITH_MICROKERNEL_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS " -march=armv8.2-a+fp16 ")
  SET_PROPERTY(SOURCE ${XNNPACK_NEONDOT_MICROKERNEL_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS " -march=armv8.2-a+dotprod ")
  SET_PROPERTY(SOURCE ${XNNPACK_AARCH64_ASM_MICROKERNEL_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS " -march=armv8.2-a+fp16+dotprod ")
```
This causes the following error when compiling with an A53 cross-compiler:
```
| cc1: warning: switch '-mcpu=cortex-a72.cortex-a53' conflicts with '-march=armv8.2-a+fp16+dotprod' switch
| xnnpack/src/f16-gemm/gen-inc/6x8inc-minmax-aarch64-neonfp16arith-ld64.S: Assembler messages:
| xnnpack/src/f16-gemm/gen-inc/6x8inc-minmax-aarch64-neonfp16arith-ld64.S:125: Error: selected processor does not support `fmla v20.8h,v16.8h,v0.h[0]'
| xnnpack/src/f16-gemm/gen-inc/6x8inc-minmax-aarch64-neonfp16arith-ld64.S:126: Error: selected processor does not support `fmla v22.8h,v16.8h,v1.h[0]'
```

Don't build with custom -mcpu flags. XNNPACK builds microkernels for all variants of AArch64 and chooses which one to use at runtime.
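For context, runtime selection of this kind is typically driven by CPU feature detection rather than compile-time flags. The sketch below is not XNNPACK's actual dispatch code; it is a minimal, self-contained illustration of how a Linux/aarch64 process can probe for the FP16-arithmetic and dot-product extensions that the flagged microkernels require, using the standard getauxval(AT_HWCAP) interface (the HWCAP bit values are the ones Linux defines for aarch64):

```c
#include <stdio.h>
#include <sys/auxv.h>   /* getauxval, AT_HWCAP (Linux) */

/* HWCAP bits for the ARMv8.2 extensions used by the microkernels above.
   Normally provided by <asm/hwcap.h>; spelled out here for clarity. */
#ifndef HWCAP_ASIMDHP
#define HWCAP_ASIMDHP (1UL << 10)  /* half-precision NEON arithmetic (+fp16) */
#endif
#ifndef HWCAP_ASIMDDP
#define HWCAP_ASIMDDP (1UL << 20)  /* dot-product instructions (+dotprod) */
#endif

int main(void) {
    unsigned long hwcap = getauxval(AT_HWCAP);
    /* An ARMv8.0 core such as the Cortex-A53 or Cortex-A72 reports neither
       bit, so a runtime dispatcher would fall back to plain NEON kernels. */
    printf("fp16 arith: %s\n", (hwcap & HWCAP_ASIMDHP) ? "yes" : "no");
    printf("dotprod:    %s\n", (hwcap & HWCAP_ASIMDDP) ? "yes" : "no");
    return 0;
}
```

This is why building the ARMv8.2 microkernels with per-source -march flags is safe even for ARMv8.0 deployment targets: the instructions are only emitted into objects that are never executed on CPUs lacking the corresponding HWCAP bits.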

Thanks, that's interesting. I'm working on integrating TensorFlow Lite into a Yocto build, and the Yocto toolchain passes this -mcpu flag.