Failing build on Arch Linux
lahwaacz opened this issue
The current develop branch fails to build on Arch Linux:
[856/1238] Linking CUDA executable cuda/test/base/array
FAILED: cuda/test/base/array
: && /opt/cuda/bin/g++ -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -Wl,-rpath -Wl,/usr/lib -Wl,--enable-new-dtags cuda/test/base/CMakeFiles/cuda_test_base_array.dir/array.cu.o -o cuda/test/base/array -Wl,-rpath,/build/ginkgo-hpc-git/src/build/lib lib/libginkgo.so.1.7.0 lib/libginkgo_omp.so.1.7.0 lib/libginkgo_cuda.so.1.7.0 -ldl lib/libginkgo_reference.so.1.7.0 lib/libginkgo_hip.so.1.7.0 lib/libginkgo_dpcpp.so.1.7.0 lib/libginkgo_device.so.1.7.0 /usr/lib/libhwloc.so /usr/lib/libhwloc.so /usr/lib/libmpi_cxx.so /usr/lib/libmpi.so /usr/lib/libgtest_main.so.1.13.0 /usr/lib/libgtest.so.1.13.0 -lcudadevrt -lcudart_static -lrt -lpthread -ldl -L"/opt/cuda/targets/x86_64-linux/lib/stubs" -L"/opt/cuda/targets/x86_64-linux/lib" && :
/usr/bin/ld: lib/libginkgo.so.1.7.0: undefined reference to `std::ios_base_library_init()@GLIBCXX_3.4.32'
/usr/bin/ld: lib/libginkgo_cuda.so.1.7.0: undefined reference to `std::ios_base_library_init()'
collect2: error: ld returned 1 exit status
There are many more linking errors like this.
Could you tell us which compiler and CUDA versions you are using? There may be an incompatibility between them. (I think detailed.log in the build directory should have the necessary information.)
The detailed.log contains:
CMAKE_CXX_COMPILER: GNU 13.2.1 on platform Linux x86_64
CMAKE_CUDA_COMPILER: /opt/cuda/bin/nvcc
CMAKE_CUDA_COMPILER_VERSION: 12.2.91
CMAKE_CUDA_HOST_COMPILER: <empty>
The CUDA host compiler is actually set via the /opt/cuda/bin/g++ -> /usr/bin/g++-12 symlink. Arch Linux has g++-12 (GCC) 12.3.0.
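To double-check which compiler a wrapper path like this really resolves to, something along these lines can help. It is only a sketch: `resolve_compiler` is a hypothetical helper name, and it assumes `readlink -f` from GNU coreutils is available.

```shell
# Hypothetical helper: print the real file behind a compiler path,
# following any chain of symlinks (relies on GNU `readlink -f`).
resolve_compiler() {
    if [ ! -e "$1" ]; then
        echo "not found: $1" >&2
        return 1
    fi
    readlink -f -- "$1"
}

# On the setup described above this is expected to point at g++-12.
# The call is guarded so the snippet does nothing on machines without
# the Arch CUDA package layout.
if [ -e /opt/cuda/bin/g++ ]; then
    resolve_compiler /opt/cuda/bin/g++
    /opt/cuda/bin/g++ --version | head -n 1
fi
```

Comparing that result with what `g++ --version` reports for the default compiler shows quickly whether host code and CUDA code are being built by different GCC releases.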
Could you use the host compiler for compiling all of ginkgo?
After some comments from CMake developers, we recently moved away from setting CMAKE_CUDA_HOST_COMPILER inside Ginkgo, since providing a compatible environment cannot be our responsibility. You can use the CUDAHOSTCXX environment variable to specify which host compiler to use.
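As a rough illustration of the CUDAHOSTCXX approach: the compiler path /usr/bin/g++-12 is taken from the Arch setup described in this thread, while the build directory name `build` and the use of the `GINKGO_BUILD_CUDA` option are assumptions to adjust for your own configuration.

```shell
# Assumption: g++-12 is a host compiler supported by the installed CUDA.
# CMake reads CUDAHOSTCXX only on the first configure of a fresh build
# directory and uses it to initialize CMAKE_CUDA_HOST_COMPILER.
export CUDAHOSTCXX=/usr/bin/g++-12

# Configure only when run from a source tree with cmake installed,
# so the snippet is harmless anywhere else.
if command -v cmake >/dev/null 2>&1 && [ -f CMakeLists.txt ]; then
    cmake -S . -B build -DGINKGO_BUILD_CUDA=ON
fi
```

This only selects the host compiler that nvcc uses. The rest of the project is still built with whatever CMAKE_CXX_COMPILER resolves to, so the two need to be compatible with each other.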
Sounds like the default compiler Arch uses is incompatible with the CUDA version available: https://forums.developer.nvidia.com/t/identifier-float32-is-undefined-etc-cuda-12-2-0-gcc-13-1/258930
Building everything with the CUDA package-provided gcc 12 seems to work on Arch.
> Sounds like the default compiler Arch uses is incompatible with the CUDA version available: https://forums.developer.nvidia.com/t/identifier-float32-is-undefined-etc-cuda-12-2-0-gcc-13-1/258930
That's exactly why Arch Linux provides the /opt/cuda/bin/g++ -> /usr/bin/g++-12 symlink.
> Building everything with the CUDA package-provided gcc 12 seems to work on Arch.
For me, passing -DCMAKE_C_COMPILER=gcc-12 -DCMAKE_CXX_COMPILER=g++-12 to cmake results in an error because cmake cannot find MPI:
-- Could NOT find VTune (missing: VTune_EXECUTABLE VTune_LIBRARY VTune_INCLUDE_DIR)
-- Could NOT find METIS (missing: METIS_LIBRARY METIS_INCLUDE_DIR)
-- Could NOT find MPI_CXX (missing: MPI_CXX_WORKS) (Required is at least version "3.1")
CMake Error at /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
Could NOT find MPI (missing: MPI_CXX_FOUND CXX) (Required is at least
version "3.1")
Call Stack (most recent call first):
/usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:600 (_FPHSA_FAILURE_MESSAGE)
/usr/share/cmake/Modules/FindMPI.cmake:1837 (find_package_handle_standard_args)
CMakeLists.txt:239 (find_package)
The same thing happens when I set the compilers via the CC and CXX environment variables. Of course the openmpi package is installed, and the configure step actually passed before I started setting the compilers.
That's the same issue (from CMakeConfigureLog.yaml):
kind: "try_compile-v1"
backtrace:
  - "/usr/share/cmake/Modules/FindMPI.cmake:1278 (try_compile)"
  - "/usr/share/cmake/Modules/FindMPI.cmake:1322 (_MPI_try_staged_settings)"
  - "/usr/share/cmake/Modules/FindMPI.cmake:1645 (_MPI_check_lang_works)"
  - "CMakeLists.txt:2 (find_package)"
description: "The MPI test test_mpi for CXX in mode normal"
directories:
  source: "/test/build/CMakeFiles/CMakeScratch/TryCompile-BvetGK"
  binary: "/test/build/CMakeFiles/CMakeScratch/TryCompile-BvetGK"
cmakeVariables:
  CMAKE_CXX_FLAGS: ""
buildResult:
  variable: "MPI_RESULT_CXX_test_mpi_normal"
  cached: true
  stdout: |
    Change Dir: '/test/build/CMakeFiles/CMakeScratch/TryCompile-BvetGK'
    Run Build Command(s): /usr/sbin/ninja -v cmTC_33b4e
    [1/2] /opt/cuda/bin/g++ -o CMakeFiles/cmTC_33b4e.dir/test_mpi.cpp.o -c /test/build/CMakeFiles/CMakeScratch/TryCompile-BvetGK/test_mpi.cpp
    [2/2] : && /opt/cuda/bin/g++ -rdynamic -Wl,-rpath -Wl,/usr/lib -Wl,--enable-new-dtags CMakeFiles/cmTC_33b4e.dir/test_mpi.cpp.o -o cmTC_33b4e /usr/lib/libmpi_cxx.so /usr/lib/libmpi.so && :
    FAILED: cmTC_33b4e
    : && /opt/cuda/bin/g++ -rdynamic -Wl,-rpath -Wl,/usr/lib -Wl,--enable-new-dtags CMakeFiles/cmTC_33b4e.dir/test_mpi.cpp.o -o cmTC_33b4e /usr/lib/libmpi_cxx.so /usr/lib/libmpi.so && :
    /usr/sbin/ld: /usr/lib/libmpi_cxx.so: undefined reference to `std::ios_base_library_init()@GLIBCXX_3.4.32'
    collect2: error: ld returned 1 exit status
    ninja: build stopped: subcommand failed.
For this to work, libmpi_cxx.so needs to be compiled for the right standard library.
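One rough way to see which libraries mention the symbol version from the error is to scan them for the string GLIBCXX_3.4.32. The sketch below is not a proper ELF inspection: `has_version_string` is a hypothetical helper, the libstdc++ path is an assumption (the libmpi_cxx.so path comes from the log above), and the check cannot tell whether a library provides the version or merely requires it.

```shell
# Hypothetical helper: succeed if the file contains the given string as a
# complete NUL-terminated entry, e.g. a symbol version like GLIBCXX_3.4.32.
# It scans raw bytes, so it needs neither binutils nor a valid ELF file.
has_version_string() {
    tr '\0' '\n' < "$1" | grep -a -q -x -F -- "$2"
}

# Assumed library locations; adjust for your system.
for lib in /usr/lib/libstdc++.so.6 /usr/lib/libmpi_cxx.so; do
    if [ -r "$lib" ]; then
        if has_version_string "$lib" GLIBCXX_3.4.32; then
            echo "$lib mentions GLIBCXX_3.4.32"
        else
            echo "$lib does not mention GLIBCXX_3.4.32"
        fi
    fi
done
```

If libmpi_cxx.so mentions the version but the libstdc++ that the chosen g++ links against does not, the link step fails in the way shown in the log.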
Actually, the gcc12 package in Arch needs to be rebuilt after a recent glibc package update... I'll let the packagers know, sorry for the noise.
@lahwaacz do you want to keep this open for CUDA/HIP support?
@upsj This was already fixed...?
@lahwaacz You can't build CUDA and HIP with _GLIBCXX_DEBUG support: if you try, you run into compilation issues, and if you don't, you run into ABI-related linker issues.
@upsj Well the asserts are a different issue 🤷