DLTcollab / sse2neon

A translator from Intel SSE intrinsics to Arm/Aarch64 NEON implementation


platform aarch64, cannot initialize parameter of float32x4_t with __m128d

mexuaz opened this issue

Clang 14.0.7 (Android NDK 25.2) produces the following error when cross-compiling from Ubuntu 22 for the aarch64 target.

sse2neon.h:5457:33: error: cannot initialize a parameter of type 'float32x4_t' (vector of 4 'float32_t' values) with an lvalue of type '__m128d' (aka 'float64x2_t')
      __builtin_nontemporal_store(a, (float32x4_t *) p);

I managed to build sse2neon on AArch64 Linux:

$ clang++-15 -v
Ubuntu clang version 15.0.6
Target: aarch64-unknown-linux-gnu
$ make CXX=clang++-15
clang++-15 -o tests/common.o -Wall -Wcast-qual -I. -march=armv8-a+fp+simd -std=gnu++14 -c -MMD -MF tests/common.o.d tests/common.cpp
clang++-15 -o tests/impl.o -Wall -Wcast-qual -I. -march=armv8-a+fp+simd -std=gnu++14 -c -MMD -MF tests/impl.o.d tests/impl.cpp
clang++-15 -o tests/main.o -Wall -Wcast-qual -I. -march=armv8-a+fp+simd -std=gnu++14 -c -MMD -MF tests/main.o.d tests/main.cpp
clang++-15 -lm -o tests/main tests/binding.o tests/common.o tests/impl.o tests/main.o

The build succeeded. However, when I passed the option -flax-vector-conversions to clang, I got the same error:

./sse2neon.h:5457:33: error: cannot initialize a parameter of type 'float32x4_t' (vector of 4 'float32_t' values) with an lvalue of type '__m128d' (aka 'float64x2_t')
    __builtin_nontemporal_store(a, (float32x4_t *) p);
                                ^
1 error generated.

One possible solution is to drop the -flax-vector-conversions option.
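
Another option, offered only as a sketch and not as the change made in sse2neon, is to cast the destination pointer to the same vector type as the value being stored, so the call does not depend on any implicit vector conversion. The names f64x2 and stream_pd below are hypothetical stand-ins for float64x2_t and the affected store function; the typedef uses the GCC/Clang vector extension so the sketch also compiles on non-Arm hosts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for float64x2_t (__m128d on AArch64). */
typedef double f64x2 __attribute__((vector_size(16), may_alias));

#ifndef __has_builtin
#define __has_builtin(x) 0 /* compilers without __has_builtin take the fallback */
#endif

/* Store two doubles to p, which is assumed to be 16-byte aligned.
 * The pointer is cast to f64x2 *, matching the type of a, so the
 * value and the pointee have the same element type. */
static inline void stream_pd(double *p, f64x2 a)
{
    assert(((uintptr_t) p & 15) == 0);
#if __has_builtin(__builtin_nontemporal_store)
    __builtin_nontemporal_store(a, (f64x2 *) p);
#else
    memcpy(p, &a, sizeof a); /* plain store when the builtin is unavailable */
#endif
}
```

Whether this matches the code at sse2neon.h:5457 in every respect is an assumption; the point is only that the value and the pointee share one vector type.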

Duplicate of #571