Cuda is not available
filevich opened this issue
CPU only works fine; but when I run
gotch.NewCuda().CudaIfAvailable()
I get "Cuda is not available."
Can anybody help me out?
I have very carefully followed every step in the installation guide, and even got the "creating $GOTCH_LIB_FILE for GPU" message at the end of the installation process.
OS: Ubuntu 20.04
GPU: RTX 3060
/usr/local/cuda-11.3 ✅✅
$ ls /usr/local/cuda/include | grep cudnn
cudnn.h
$ ls /usr/local/cuda/lib64 | grep cudnn
libcudnn_adv_infer.so
libcudnn_adv_infer.so.8
(...)
libcudnn_static.a
libcudnn_static_v8.a
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Mon_May__3_19:15:13_PDT_2021
Cuda compilation tools, release 11.3, V11.3.109
Build cuda_11.3.r11.3/compiler.29920130_0
$ echo $LD_LIBRARY_PATH;
$ echo $CUDA_VERSION;
$ echo $CUDA_VERSION;
$ echo $CU_VERSION;
$ echo $GOTCH_LIBTORCH;
$ echo $LIBRARY_PATH;
$ echo $CPATH;
$ echo $LD_LIBRARY_PATH;
/usr/local/cuda-11.3/lib64::/usr/local/lib/libtorch/lib:/usr/lib64-nvidia:/usr/local/cuda-11.3/lib64
11.3
11.3
/usr/local/lib/libtorch
:/usr/local/lib/libtorch/lib
:/usr/local/lib/libtorch/lib:/usr/local/lib/libtorch/include:/usr/local/lib/libtorch/include/torch/csrc/api/include
/usr/local/cuda-11.3/lib64::/usr/local/lib/libtorch/lib:/usr/lib64-nvidia:/usr/local/cuda-11.3/lib64
$ nvidia-smi
Tue Aug 23 00:56:51 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01 Driver Version: 515.65.01 CUDA Version: 11.7 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:03:00.0 On | N/A |
| 0% 39C P8 19W / 170W | 283MiB / 12288MiB | 1% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1049 G /usr/lib/xorg/Xorg 29MiB |
| 0 N/A N/A 1567 G /usr/lib/xorg/Xorg 97MiB |
| 0 N/A N/A 1701 G /usr/bin/gnome-shell 40MiB |
| 0 N/A N/A 2081 G ...867554538896762554,131072 65MiB |
| 0 N/A N/A 31849 G ...RendererForSitePerProcess 39MiB |
+-----------------------------------------------------------------------------+
One interesting thing:
when I run the Go program with go run *.go, it prints the error message "Cuda is not available." as I mentioned before; but when I run go build *.go && ./main, it gets stuck. No output, nothing.
The NVIDIA drivers, CUDA 11.3 and cuDNN were installed using this gist / script on a fresh Ubuntu 20.04 partition.
I then installed libtorch according to the README guide using export CUDA_VER=11.3 && bash setup-libtorch.sh
and finally gotch using export CUDA_VER=11.3 && export GOTCH_VER=v0.7.0 && bash setup-gotch.sh
There were no errors when linking or compiling; everything executed as expected.
I have already tried Ubuntu 22.04 + CUDA 11.7 + libtorch 1.12 (not 1.11), but it wouldn't even compile.
Maybe I'll try Ubuntu 18 + CUDA 10.2
Or downgrading nvidia drivers
🤷‍♂️🤷‍♂️
Any help appreciated
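The LD_LIBRARY_PATH output above contains an empty entry (the "::") and lists /usr/local/cuda-11.3/lib64 twice. Nothing in this thread establishes that this is related to the problem, but the dynamic loader treats an empty entry as the current working directory, so it may be worth cleaning up. A minimal sketch for spotting such entries, using only the Go standard library (auditPathList is a made-up helper name, not part of gotch):

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// auditPathList splits a colon-separated search path and counts empty
// entries and collects duplicated ones. Hypothetical helper for illustration.
func auditPathList(value string) (empty int, dups []string) {
	if value == "" {
		// Unset or empty variable: nothing to audit.
		return 0, nil
	}
	seen := map[string]bool{}
	for _, p := range strings.Split(value, ":") {
		switch {
		case p == "":
			empty++
		case seen[p]:
			dups = append(dups, p)
		default:
			seen[p] = true
		}
	}
	return empty, dups
}

func main() {
	// Variable names are the ones echoed earlier in this post.
	for _, name := range []string{"LD_LIBRARY_PATH", "LIBRARY_PATH", "CPATH"} {
		empty, dups := auditPathList(os.Getenv(name))
		fmt.Printf("%s: %d empty entries, duplicates: %v\n", name, empty, dups)
	}
}
```

Run it in the same shell that is used for go run, so that it sees the same environment.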
Hi @filevich,
I don't have any machines with CUDA 11.3 now. However, please have a look at the Google Colab I set up with Gotch and CUDA 11.3 here.
Maybe you should delete libtorch at /usr/local/lib/libtorch and reinstall it. Also, try to use clang instead of gcc as the C compiler, as in the Google Colab.
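Try:
package main

import (
	"fmt"

	"github.com/sugarme/gotch"
	"github.com/sugarme/gotch/ts"
)

func main() {
	// Select the CUDA device if one is usable, otherwise fall back to CPU.
	device := gotch.CudaIfAvailable()
	fmt.Println(device)

	// Create a 3x4x5 tensor of ones on the selected device.
	x := ts.MustOnes([]int64{3, 4, 5}, gotch.Double, device)
	// Print the tensor with gotch's %i verb.
	fmt.Printf("%i", x)
}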
I am using CUDA 11.1 with an RTX 3060 and it runs just fine.