Hardcoded lib path in build script causes error

Question

muslll opened this issue 6 months ago

Description

The library path in setup_build is hardcoded. Some systems, such as Debian, do not install the NVIDIA libraries to /usr/lib64/ when nvidia-cuda-dev and nvidia-cuda-toolkit are installed through apt:

$ find /usr/ -name libcudart_static*
/usr/lib/x86_64-linux-gnu/libcudart_static.a

I worked around this by creating symlinks:

ln -s /usr/lib/x86_64-linux-gnu/libcuda* /usr/lib64/

However, the proper fix would be to stop hardcoding the path, or to add conditions that handle other library layouts.
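
As an illustration of the "add conditions" option, the sketch below probes several candidate library directories under the CUDA root instead of assuming lib64. The function name `find_static_lib` and the `CANDIDATE_LIBDIRS` list are hypothetical and are not taken from CuPy's build code; only the Debian directory comes from the `find` output above.

```python
import os
from typing import Iterable, Optional

# Hypothetical candidate subdirectories, relative to the CUDA root.
# This is an illustrative list, not the one CuPy uses.
CANDIDATE_LIBDIRS = (
    "lib64",
    "lib",
    os.path.join("lib", "x86_64-linux-gnu"),  # Debian multiarch layout
)


def find_static_lib(
    cuda_root: str,
    name: str = "libcudart_static.a",
    subdirs: Iterable[str] = CANDIDATE_LIBDIRS,
) -> Optional[str]:
    """Return the first existing path to `name` under `cuda_root`, or None."""
    for subdir in subdirs:
        candidate = os.path.join(cuda_root, subdir, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

Calling `find_static_lib("/usr")` on the Debian system described here would be expected to return the path under /usr/lib/x86_64-linux-gnu/, assuming no copy exists in the earlier candidates.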

To Reproduce

poetry add cupy

On Debian:

$ uname -a
Linux anon 6.6.9-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.6.9-1 (2024-01-01) x86_64 GNU/Linux

Installation

Source (pip install cupy)

Environment

OS                           : Linux-6.6.9-amd64-x86_64-with-glibc2.37
Python Version               : 3.11.7
CuPy Version                 : 13.0.0
CuPy Platform                : NVIDIA CUDA
NumPy Version                : 1.26.3
SciPy Version                : 1.11.4
Cython Build Version         : 0.29.36
Cython Runtime Version       : None
CUDA Root                    : /usr
nvcc PATH                    : /usr/bin/nvcc
CUDA Build Version           : 12020
CUDA Driver Version          : 12000
CUDA Runtime Version         : 12020 (linked to CuPy) / 12000 (locally installed)
cuBLAS Version               : (available)
cuFFT Version                : 11001
cuRAND Version               : 10301
cuSOLVER Version             : (11, 4, 3)
cuSPARSE Version             : (available)
NVRTC Version                : (12, 0)
Thrust Version               : 200200
CUB Build Version            : 200200
Jitify Build Version         : <unknown>
cuDNN Build Version          : (not loaded; try `import cupy.cuda.cudnn` first)
cuDNN Version                : (not loaded; try `import cupy.cuda.cudnn` first)
NCCL Build Version           : None
NCCL Runtime Version         : None
cuTENSOR Version             : None
cuSPARSELt Build Version     : None
Device 0 Name                : NVIDIA GeForce GTX 1050
Device 0 Compute Capability  : 61
Device 0 PCI Bus ID          : 0000:08:00.0

Additional Information

No response

Leo Fang · Answer 1 · Mon Jan 22 2024 08:57:38 GMT+0800 (China Standard Time)

Just curious, did building against distro packages ever work for CuPy? That is, did CuPy v12 and earlier work with this setup, while v13 does not?

Kenichi Maehashi · Answer 2 · Mon Jan 22 2024 16:26:25 GMT+0800 (China Standard Time)

Maybe I should have let the linker find static libraries by specifying -lcudart_static, instead of discovering a file path on our own?
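
To make the two approaches concrete, here is a rough sketch of the linker arguments each one produces. The helper names are hypothetical; this is not CuPy's build code and makes no claim about how the eventual fix was implemented.

```python
from typing import List, Sequence


def link_args_by_path(static_lib_path: str) -> List[str]:
    """Approach 1: pass the archive's full path.

    The build script has to discover this path itself, so it breaks
    when the library lives somewhere the script did not anticipate.
    """
    return [static_lib_path]


def link_args_by_name(
    libraries: Sequence[str], library_dirs: Sequence[str] = ()
) -> List[str]:
    """Approach 2: pass -l flags and let the linker search for the library.

    Optional -L flags add extra directories to the linker's search path.
    """
    return [f"-L{d}" for d in library_dirs] + [f"-l{name}" for name in libraries]


# Hardcoded location versus a name the linker resolves on its own.
print(link_args_by_path("/usr/lib64/libcudart_static.a"))
print(link_args_by_name(["cudart_static"]))
```

In setuptools terms, the second approach roughly corresponds to listing the name in an `Extension`'s `libraries` argument rather than putting a full path in `extra_objects`.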

Leo Fang · Answer 3 · Mon Jan 22 2024 22:56:53 GMT+0800 (China Standard Time)

Not sure, would that work on Windows too?

Leo Fang · Answer 4 · Mon Jan 22 2024 22:58:27 GMT+0800 (China Standard Time)

My question is more about the distro package layout. If we have libs and headers scattered (e.g. /usr/lib/x86_64-linux-gnu/, /usr/include, ...) I suppose the old CUDA_PATH based logic never worked?

Leo Fang · Answer 5 · Tue Jan 23 2024 10:14:06 GMT+0800 (China Standard Time)

> Maybe I should have let the linker find static libraries by specifying -lcudart_static

I guess it'd work!

Kenichi Maehashi · Answer 6 · Thu Jan 25 2024 14:59:29 GMT+0800 (China Standard Time)

@muslll, we've just merged the fix #8134. Could you try with the latest main branch?

@leofang

> My question is more about the distro package layout. If we have libs and headers scattered (e.g. /usr/lib/x86_64-linux-gnu/, /usr/include, ...) I suppose the old CUDA_PATH based logic never worked?

I guess these distro-default paths are on the compiler/linker search path by default?
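
One way to check that guess on a given machine is to ask the compiler which directories it searches for libraries. The sketch below assumes a GCC-compatible compiler that supports `-print-search-dirs` and prints a `libraries:` line; the function names are hypothetical.

```python
import os
import subprocess
from typing import List


def parse_library_dirs(search_dirs_output: str) -> List[str]:
    """Extract normalized library directories from `-print-search-dirs` output."""
    for line in search_dirs_output.splitlines():
        if line.startswith("libraries:"):
            # The value looks like "=/dir/a/:/dir/b/"; drop the leading "=".
            raw = line[len("libraries:"):].strip().lstrip("=")
            return [os.path.normpath(p) for p in raw.split(":") if p]
    return []


def compiler_library_dirs(compiler: str = "gcc") -> List[str]:
    """Run the compiler and return the library directories it reports."""
    result = subprocess.run(
        [compiler, "-print-search-dirs"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_library_dirs(result.stdout)
```

If `"/usr/lib/x86_64-linux-gnu"` appears in `compiler_library_dirs()`, the compiler driver already passes that directory to the linker, so a plain `-lcudart_static` should resolve without any extra `-L` flag.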

musl · Answer 7 · Fri Jan 26 2024 16:56:50 GMT+0800 (China Standard Time)

Thanks @kmaehashi 👍