marian-nmt / marian-dev

Fast Neural Machine Translation in C++ - development repository

Home Page: https://marian-nmt.github.io

CMake cannot find cuBLASLt

ZJaume opened this issue

Bug description

CMake cannot find cuBLASLt even though it is in the same directory as cuBLAS.

How to reproduce

Run cmake .. on a Lambda Labs Cloud machine.

Context

  • Marian version: 1.11.7
  • CMake command: cmake ..
  • Log file: CMake output follows
-- The CXX compiler identification is GNU 9.4.0
-- The C compiler identification is GNU 9.4.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Project name: marian
-- Project version: v1.11.7+e27da623
Submodule 'examples' (https://github.com/marian-nmt/marian-examples) registered for path 'examples'
Submodule 'regression-tests' (https://github.com/marian-nmt/marian-regression-tests) registered for path 'regression-tests'
Submodule 'src/3rd_party/fbgemm' (https://github.com/marian-nmt/FBGEMM) registered for path 'src/3rd_party/fbgemm'
Submodule 'src/3rd_party/intgemm' (https://github.com/marian-nmt/intgemm/) registered for path 'src/3rd_party/intgemm'
Submodule 'src/3rd_party/nccl' (https://github.com/marian-nmt/nccl) registered for path 'src/3rd_party/nccl'
Submodule 'src/3rd_party/sentencepiece' (https://github.com/marian-nmt/sentencepiece) registered for path 'src/3rd_party/sentencepiece'
Submodule 'src/3rd_party/simple-websocket-server' (https://github.com/marian-nmt/Simple-WebSocket-Server) registered for path 'src/3rd_party/simple-websocket-server'
Cloning into '/home/ubuntu/marian/marian-dev/examples'...
Cloning into '/home/ubuntu/marian/marian-dev/regression-tests'...
Cloning into '/home/ubuntu/marian/marian-dev/src/3rd_party/fbgemm'...
Cloning into '/home/ubuntu/marian/marian-dev/src/3rd_party/intgemm'...
Cloning into '/home/ubuntu/marian/marian-dev/src/3rd_party/nccl'...
Cloning into '/home/ubuntu/marian/marian-dev/src/3rd_party/sentencepiece'...
Cloning into '/home/ubuntu/marian/marian-dev/src/3rd_party/simple-websocket-server'...
Submodule path 'examples': checked out '29f4f7c380c860a95b9375813f4b199b2e6b5556'
Submodule path 'regression-tests': checked out '4fa9ff55af68bc87d8bd04c9b410f1e1d3874718'
Submodule path 'src/3rd_party/fbgemm': checked out '6f45243cb8ab7d7ab921af18d313ae97144618b8'
Submodule 'third_party/asmjit' (https://github.com/asmjit/asmjit.git) registered for path 'src/3rd_party/fbgemm/third_party/asmjit'
Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo) registered for path 'src/3rd_party/fbgemm/third_party/cpuinfo'
Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'src/3rd_party/fbgemm/third_party/googletest'
Cloning into '/home/ubuntu/marian/marian-dev/src/3rd_party/fbgemm/third_party/asmjit'...
Cloning into '/home/ubuntu/marian/marian-dev/src/3rd_party/fbgemm/third_party/cpuinfo'...
Cloning into '/home/ubuntu/marian/marian-dev/src/3rd_party/fbgemm/third_party/googletest'...
Submodule path 'src/3rd_party/fbgemm/third_party/asmjit': checked out '4da474ac9aa2689e88d5e40a2f37628f302d7e3c'
Submodule path 'src/3rd_party/fbgemm/third_party/cpuinfo': checked out 'd5e37adf1406cf899d7d9ec1d317c47506ccb970'
Submodule path 'src/3rd_party/fbgemm/third_party/googletest': checked out '0fc5466dbb9e623029b1ada539717d10bd45e99e'
Submodule path 'src/3rd_party/intgemm': checked out 'a05a2e51ab524bcee954a39ee72005193f3adf7c'
Submodule path 'src/3rd_party/nccl': checked out '5dcf7751494f9d04057bfc6b4a2b64611bc12253'
Submodule path 'src/3rd_party/sentencepiece': checked out '5312a306c4c0a458e29a8882ebfb42a179aaf580'
Submodule path 'src/3rd_party/simple-websocket-server': checked out '1d7e84aeb3f1ebdc78f6965d79ad3ca3003789fe'
CMake Warning at CMakeLists.txt:79 (message):
  CMAKE_BUILD_TYPE not set; setting to Release


-- Building with -march=native and intrinsics will be chosen automatically by the compiler to match the current machine.
-- Checking support for CPU intrinsics
-- Could not find hardware support for AVX512 on this machine.
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /usr (found suitable version "11.6", minimum required is "9.0")
-- Compiling code for Pascal GPUs
-- Compiling code for Volta GPUs
-- Compiling code for Turing GPUs
-- Compiling code for Ampere GPUs
-- Compiling code for Ampere RTX GPUs
CMake Error at CMakeLists.txt:415 (message):
  cuBLASLt library not found


-- Configuring incomplete, errors occurred!
See also "/home/ubuntu/marian/marian-dev/build/CMakeFiles/CMakeOutput.log".
See also "/home/ubuntu/marian/marian-dev/build/CMakeFiles/CMakeError.log".
$ cmake .. -LAH -N | grep cublas
CUDA_cublasLt_LIBRARY:FILEPATH=CUDA_cublasLt_LIBRARY-NOTFOUND
// "cublas" library
CUDA_cublas_LIBRARY:FILEPATH=/usr/lib/x86_64-linux-gnu/libcublas.so
$ ll /usr/lib/x86_64-linux-gnu/libcublas*
lrwxrwxrwx 1 root root        15 Mar 28 19:23 /usr/lib/x86_64-linux-gnu/libcublas.so -> libcublas.so.11
lrwxrwxrwx 1 root root        23 Mar 28 19:23 /usr/lib/x86_64-linux-gnu/libcublas.so.11 -> libcublas.so.11.9.2.110
-rw-r--r-- 1 root root 156720544 Mar 18 17:42 /usr/lib/x86_64-linux-gnu/libcublas.so.11.9.2.110
lrwxrwxrwx 1 root root        17 Mar 28 19:23 /usr/lib/x86_64-linux-gnu/libcublasLt.so -> libcublasLt.so.11
lrwxrwxrwx 1 root root        25 Mar 28 19:23 /usr/lib/x86_64-linux-gnu/libcublasLt.so.11 -> libcublasLt.so.11.9.2.110
-rw-r--r-- 1 root root 350346136 Mar 18 17:42 /usr/lib/x86_64-linux-gnu/libcublasLt.so.11.9.2.110
-rw-r--r-- 1 root root 506801278 Mar 18 17:42 /usr/lib/x86_64-linux-gnu/libcublasLt_static.a
-rw-r--r-- 1 root root 187283786 Mar 18 17:42 /usr/lib/x86_64-linux-gnu/libcublas_static.a
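
The same check can be run against the cache file directly, which lists every variable CMake failed to resolve rather than only those matching "cublas". This is a minimal sketch, assuming a CMakeCache.txt has been written in the build directory; the function name notfound_vars is hypothetical and not part of Marian or CMake.

```shell
# Hypothetical helper: print the names of cache variables that CMake
# could not resolve. Cache entries have the form NAME:TYPE=VALUE, and
# unresolved find results end in "-NOTFOUND".
notfound_vars() {
    cache_file=$1
    grep -e '-NOTFOUND$' "$cache_file" | cut -d: -f1
}
```

For example, notfound_vars build/CMakeCache.txt would be expected to include CUDA_cublasLt_LIBRARY for the configuration shown above.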

Passing the library path explicitly works around the issue:

cmake .. -DCUDA_cublasLt_LIBRARY=/usr/lib/x86_64-linux-gnu/libcublasLt.so

I am reporting it anyway in case someone is interested.
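
Since CMake does locate libcublas.so, the explicit flag can be derived from that path instead of being typed by hand. This is a minimal sketch, assuming libcublasLt.so sits in the same directory as libcublas.so, as in the listing above; the function name cublaslt_flag is hypothetical and not part of Marian or CMake.

```shell
# Hypothetical helper: given the path CMake found for libcublas.so,
# print a -DCUDA_cublasLt_LIBRARY=... flag pointing at libcublasLt.so
# in the same directory. Prints nothing and returns 1 if it is absent.
cublaslt_flag() {
    cublas_path=$1
    lib_dir=$(dirname "$cublas_path")
    candidate="$lib_dir/libcublasLt.so"
    if [ -e "$candidate" ]; then
        printf '%s\n' "-DCUDA_cublasLt_LIBRARY=$candidate"
    else
        return 1
    fi
}
```

It could then be used as:

```shell
cmake .. "$(cublaslt_flag /usr/lib/x86_64-linux-gnu/libcublas.so)"
```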