deepspeech-gpu's TensorFlow tries loading old CUDA libraries

Geremia opened this issue 2 years ago

Geremia commented 2 years ago

Have I written custom code: no
OS Platform and Distribution: Slackware Linux Current, kernel 5.15.7
TensorFlow installed from: your build
TensorFlow version: v2.3.0-6-g23ad988
Python version: 3.9.9
CUDA/cuDNN version: 11.5 / 8.3.1.22
GPU model and memory: Quadro RTX 4000, 8192MB
Deepspeech version: 0.10.0-alpha.3 (I get the same issue on 0.9.3, too.)
Exact command to reproduce: deepspeech --model deepspeech-0.9.3-models.pbmm --scorer deepspeech-0.9.3-models.scorer --audio in.wav --extended --json

deepspeech-gpu's TensorFlow tries to load:

libcublas.so.10
libcudart.so.10.1
libcudnn.so.7
libcusolver.so.10,

which are not part of CUDA Toolkit 11.5.0. If I try symlinking these names to

libcublas.so.11.7.3.1
libcudart.so.11.5.50
libcudnn.so.8.3.1
libcusolver.so.11.2.1.48,

respectively, deepspeech produces a "*** stack smashing detected ***: terminated" error.
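
One plausible reason the symlinks crash instead of working: the number right after ".so." is the library's major version, and a major bump usually signals an incompatible ABI, so a binary built against version 10 may misuse a version 11 library. The sketch below only compares the names listed above; the helper names soname_major and mismatched are made up for illustration, and the script does not prove what caused the stack smashing.

```python
import re


def soname_major(filename):
    """Return the first numeric component after '.so.', or None if absent."""
    match = re.search(r"\.so\.(\d+)", filename)
    return int(match.group(1)) if match else None


# Pairs taken from the report above:
# (name TensorFlow asks for, CUDA 11.5 file it was symlinked to).
PAIRS = [
    ("libcublas.so.10", "libcublas.so.11.7.3.1"),
    ("libcudart.so.10.1", "libcudart.so.11.5.50"),
    ("libcudnn.so.7", "libcudnn.so.8.3.1"),
    ("libcusolver.so.10", "libcusolver.so.11.2.1.48"),
]


def mismatched(pairs):
    """Keep only the pairs whose major versions differ."""
    return [(want, have) for want, have in pairs
            if soname_major(want) != soname_major(have)]


if __name__ == "__main__":
    for want, have in mismatched(PAIRS):
        print(f"{want} -> {have}: "
              f"major {soname_major(want)} vs {soname_major(have)}")
```

Every pair in the list differs in its major version, which is consistent with the symlink approach being unsafe here.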

deepspeech does successfully load:

libcuda.so.1
libcufft.so.10
libcurand.so.10
libcusparse.so.10,

which are in CUDA Toolkit 11.5.0.
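
To check which of these names the dynamic loader can resolve without running deepspeech at all, a small probe along these lines may help. It is a rough sketch: the function name probe and its loader parameter are my own, and it only tests whether dlopen succeeds via ctypes.CDLL, not whether the library is actually compatible.

```python
import ctypes

# Library names taken from the two lists in this report.
WANTED = [
    "libcublas.so.10",
    "libcudart.so.10.1",
    "libcudnn.so.7",
    "libcusolver.so.10",
    "libcuda.so.1",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusparse.so.10",
]


def probe(names, loader=ctypes.CDLL):
    """Map each library name to True if the loader can open it, else False."""
    results = {}
    for name in names:
        try:
            loader(name)  # ctypes.CDLL raises OSError when dlopen fails
            results[name] = True
        except OSError:
            results[name] = False
    return results


if __name__ == "__main__":
    for name, ok in probe(WANTED).items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

The loader argument exists only so the function can be exercised on a machine without CUDA; in normal use the default is enough.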

Command output:

2021-12-17 15:48:27.737042: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Loading model from file deepspeech-0.9.3-models.pbmm
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.10.0-alpha.3-0-gfcbd92d
2021-12-17 15:48:27.976114: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-17 15:48:27.979773: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2021-12-17 15:48:28.013417: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:42:00.0 name: Quadro RTX 4000 computeCapability: 7.5
coreClock: 1.545GHz coreCount: 36 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 387.49GiB/s
2021-12-17 15:48:28.013548: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-12-17 15:48:28.013622: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory
2021-12-17 15:48:28.060369: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2021-12-17 15:48:28.060745: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2021-12-17 15:48:28.060828: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory
2021-12-17 15:48:28.061872: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2021-12-17 15:48:28.061958: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory
2021-12-17 15:48:28.061973: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-12-17 15:48:28.148667: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-12-17 15:48:28.148704: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]      0 
2021-12-17 15:48:28.148715: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0:   N 
Loaded model in 0.19s.
Loading scorer from files deepspeech-0.9.3-models.scorer
Loaded scorer in 0.00017s.
Running inference.
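
For longer logs, the missing and loaded libraries can be pulled out of the dso_loader lines with a couple of regular expressions. This is a minimal sketch that assumes the message wording shown in the output above; the names summarize and SAMPLE are illustrative, and SAMPLE holds three lines copied from that output.

```python
import re

# Patterns match the dso_loader messages as they appear in the log above.
MISSING = re.compile(r"Could not load dynamic library '([^']+)'")
OPENED = re.compile(r"Successfully opened dynamic library (\S+)")

SAMPLE = """\
2021-12-17 15:48:27.979773: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2021-12-17 15:48:28.013548: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-12-17 15:48:28.061958: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory
"""


def summarize(log_text):
    """Return (missing, opened) library names in the order they appear."""
    return MISSING.findall(log_text), OPENED.findall(log_text)


if __name__ == "__main__":
    missing, opened = summarize(SAMPLE)
    print("missing:", ", ".join(missing))
    print("opened: ", ", ".join(opened))
```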

lissyx · Answer 1 · Sat Dec 18 2021 17:18:20 GMT+0800 (China Standard Time)

There is no bug here; it depends on CUDA 10, as documented.

Geremia · Answer 2 · Sun Dec 19 2021 01:17:02 GMT+0800 (China Standard Time)

@lissyx Why is that? Is there something better about CUDA 10?

lissyx · Answer 3 · Sun Dec 19 2021 01:51:02 GMT+0800 (China Standard Time)

It's the version that is supported by the TensorFlow version we were using for those releases.

Geremia · Answer 4 · Sun Dec 19 2021 02:43:17 GMT+0800 (China Standard Time)

@lissyx Thanks. Looks like I'll have to build it with bazel, then. ☺