CUDA error: no kernel image is available for execution on the device

Question

Brin333 opened this issue a year ago · comments

In the 'ToothGroupNetwork-challenge_branch' folder, I ran:
$ python inference_final.py --input_path ./tmp --save_path ./results

Error:
Error(s) in loading state_dict for TfCblFirstModule:
While copying the parameter named "first_ins_cent_model.enc1.0.linear.weight", whose dimensions in the model are torch.Size([32, 6]) and whose dimensions in the checkpoint are torch.Size([32, 6]), an exception occurred : ('CUDA error: no kernel image is available for execution on the device',).
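
This message generally means that the CUDA binaries being executed (PyTorch's own kernels or a separately compiled extension) were not built for the GPU's compute capability. A minimal sketch of a check for the PyTorch side, assuming `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()` are available in the installed PyTorch; `is_arch_covered` is a hypothetical helper name, not part of PyTorch or this repository:

```python
def is_arch_covered(capability, arch_list):
    """Return True if any entry in arch_list can run on a GPU with this capability.

    capability: (major, minor) tuple, e.g. (8, 6) for an RTX 3080 Ti.
    arch_list:  strings such as "sm_75" (binary code) or "compute_75" (PTX).
    """
    major, minor = capability
    for arch in arch_list:
        kind, _, digits = arch.partition("_")
        if len(digits) < 2 or not digits.isdigit():
            continue  # skip entries this sketch does not understand
        a_major, a_minor = int(digits[:-1]), int(digits[-1])
        # Binary code runs on the same major version with an equal or higher minor.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX can be JIT-compiled for any equal or newer capability.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        archs = torch.cuda.get_arch_list()
        print("GPU capability:", cap)
        print("PyTorch built for:", archs)
        print("covered:", is_arch_covered(cap, archs))
```

This only inspects the PyTorch build itself. An extension such as pointops is compiled separately, so it can still lack a kernel image for the GPU even when this check passes.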

Zauraiz Alamgeer · Answer 1 · Wed Aug 30 2023 20:35:37 GMT+0800 (China Standard Time)

Hey, I got the same error and here is how I fixed it.
Use a Conda environment. Remove your old environment and create a new one as follows, and you should be good:

Create new environment

Give it a name and use Python 3.8, no later, since newer versions are not supported by a few of the dependencies.
$ conda create -n <env_name> python=3.8
$ conda activate <env_name>

Install this specific PyTorch and CUDA toolkit version

$ conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch

Then install the other dependencies

$ pip install wandb
$ pip install --ignore-installed PyYAML
$ pip install open3d
$ pip install multimethod
$ pip install termcolor
$ pip install trimesh
$ pip install easydict

To remove the error related to libcublas.

$ conda install -c "nvidia/label/cuda-11.7.0" cuda

Then install pointops

$ cd external_libs/pointops && python setup.py install

Now you are good to go. :)

Zauraiz Alamgeer · Answer 2 · Wed Aug 30 2023 21:00:14 GMT+0800 (China Standard Time)

Right now I am trying to figure out how to visualise the predicted results and save them as an .obj file. If someone can help, that would be great. I really appreciate the great work.

Brin333 · Answer 3 · Thu Aug 31 2023 10:17:23 GMT+0800 (China Standard Time)

@ZauraizAlamgeer Thank you very much, I have fixed it! Really appreciate it!

Brin333 · Answer 4 · Mon Nov 27 2023 16:17:50 GMT+0800 (China Standard Time)

Thank you very much, appreciate it.

coordxyz · Answer 5 · Thu Jan 04 2024 12:14:46 GMT+0800 (China Standard Time)

@ZauraizAlamgeer Hi, thanks for sharing. I still get the same error after installing all dependencies in the conda environment. My setup differs slightly from yours because of my GPU. Could you please give some advice? Thanks.

GPU: RTX 3080Ti
Python: 3.7.7
cuda: 11.3
open3d: 0.9.0
pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0

The error:
File "/ToothGroupNetwork/models/modules/cbl_point_transformer/cbl_point_transformer_module.py", line 104, in forward
p0 = pxo[:,:,:3].reshape(-1, 3).contiguous()
RuntimeError: CUDA error: no kernel image is available for execution on the device

Zauraiz Alamgeer · Answer 6 · Thu Jan 04 2024 12:59:58 GMT+0800 (China Standard Time)

Hi @coordxyz, I have an RTX 3070 and the instructions in my comment work like a charm. Use Python 3.8 in a conda environment.

If you still wish to keep the versions you mentioned, check whether the following command reports the correct CUDA version:
$ nvcc --version

If you see CUDA 11.3, run the following:
$ export CUDA_HOME=/path/to/your/cuda/

If you still face issues, I recommend following my previous instructions, which should work.
Since you have an RTX 3080Ti, you will need to add the following line (Python, not a shell command) to external_libs/pointops/setup.py:

extra_compile_args={'cxx': ['-g'], 'nvcc': ['-O2', '-gencode', 'arch=compute_61,code=sm_61', '-gencode', 'arch=compute_75,code=sm_75', '-gencode', 'arch=compute_86,code=sm_86']}
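
The flag list above follows a fixed pattern, so it can be generated from the compute capabilities you want to support. A minimal sketch, assuming the same dictionary layout as the line above; `gencode_flags` is a hypothetical helper, not something the repository provides. As far as I know, nvcc only accepts sm_86 from CUDA 11.1 onwards, so the toolkit that compiles pointops has to be new enough:

```python
def gencode_flags(capabilities):
    """Build nvcc -gencode flags from capability strings such as "8.6"."""
    flags = []
    for cap in capabilities:
        major, minor = cap.split(".")
        arch = major + minor  # "8.6" -> "86"
        flags += ["-gencode", f"arch=compute_{arch},code=sm_{arch}"]
    return flags


# 6.1 (GTX 10xx), 7.5 (RTX 20xx), 8.6 (RTX 30xx); adjust to your own GPUs.
extra_compile_args = {
    "cxx": ["-g"],
    "nvcc": ["-O2"] + gencode_flags(["6.1", "7.5", "8.6"]),
}

if __name__ == "__main__":
    print(extra_compile_args)
```

After changing the flags, pointops has to be rebuilt with python setup.py install for them to take effect.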

As a last resort, if nvcc --version prints nothing, run:
$ sudo apt-get install nvidia-cuda-toolkit

This is because the PyTorch package installs only the CUDA runtime libraries. Installing via apt-get provides the whole toolkit, including nvcc, and resolves the symbolic link issues.
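
The nvcc check can also be scripted, which is handy before building pointops. A minimal sketch using only the Python standard library; `parse_nvcc_release` and `installed_nvcc_release` are hypothetical helper names, and the regular expression assumes the usual "release X.Y" wording in the nvcc --version output:

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_nvcc_release():
    """Return the release of the nvcc found on PATH, or None if there is no nvcc."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("nvcc release:", installed_nvcc_release())
```

A result of None means no nvcc is on PATH, which matches the "returns empty" case described above.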

Let me know if you need any more help.

coordxyz · Answer 7 · Sat Jan 06 2024 15:21:02 GMT+0800 (China Standard Time)

Hi @ZauraizAlamgeer, thanks a lot for your advice. I finally managed to create a working environment by using pip rather than conda to install PyTorch.

GPU: RTX 3080Ti
Python: 3.7.7
cuda: 11.3
open3d: 0.9.0
torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1

BTW, I save the predicted result as a *.obj file by simply adding
gu.save_mesh('output.obj', gu.get_colored_mesh(mesh, pred_labels))
at the end of eval_visualize_results.py.
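
For readers who want to see what such an export amounts to without the repository's gu helpers, here is a minimal sketch of writing a coloured mesh to OBJ in plain Python. It is not how gu.save_mesh is implemented; `obj_lines` and `save_obj` are hypothetical names, and per-vertex colours appended to the "v" lines are a widely supported extension rather than part of the original OBJ format:

```python
def obj_lines(vertices, faces, colors=None):
    """Return OBJ text lines for a triangle mesh.

    vertices: sequence of (x, y, z)
    faces:    sequence of (i, j, k) with 0-based vertex indices
    colors:   optional sequence of (r, g, b) per vertex
    """
    lines = []
    for idx, (x, y, z) in enumerate(vertices):
        if colors is None:
            lines.append(f"v {x} {y} {z}")
        else:
            r, g, b = colors[idx]
            lines.append(f"v {x} {y} {z} {r} {g} {b}")
    for i, j, k in faces:
        # OBJ face indices are 1-based.
        lines.append(f"f {i + 1} {j + 1} {k + 1}")
    return lines


def save_obj(path, vertices, faces, colors=None):
    """Write the mesh to `path` as an OBJ file."""
    with open(path, "w") as handle:
        handle.write("\n".join(obj_lines(vertices, faces, colors)) + "\n")


if __name__ == "__main__":
    tri = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
    print("\n".join(obj_lines(tri, [(0, 1, 2)], [(1, 0, 0)] * 3)))
```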

Frq-F · Answer 8 · Tue Apr 23 2024 09:44:21 GMT+0800 (China Standard Time)

Hi @ZauraizAlamgeer, I have an RTX 4090. I set up the environment following the method you provided, but it still reports an error:

Traceback (most recent call last):
File "preprocess_data.py", line 56, in <module>
labeled_vertices = gu.resample_pcd([labeled_vertices], 24000, "fps")[0]
File "/workspace/ToothGroupNetwork-main/gen_utils.py", line 132, in resample_pcd
idx = fps(pcd_ls[0][:,:3], n)
File "/workspace/ToothGroupNetwork-main/gen_utils.py", line 142, in fps
idx = pointops.furthestsampling(xyz, torch.tensor([xyz.shape[0]]).cuda().type(torch.int), torch.tensor([npoint]).cuda().type(torch.int))

ericnlt · Answer 9 · Sat Apr 27 2024 13:33:06 GMT+0800 (China Standard Time)

@coordxyz Regarding your working setup above (RTX 3080Ti, Python 3.7.7, torch==1.12.1+cu113 installed with pip): are you on Ubuntu or Windows?

leonlee · Answer 10 · Thu Jun 20 2024 17:34:39 GMT+0800 (China Standard Time)

@coordxyz Hi, I am on Linux and my GPU is also a 3080Ti. My environment is pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0, but when I train the tsegnet centroid prediction model, the loss values are all NaN. Do you know what might cause this?

leonlee · Answer 11 · Thu Jun 20 2024 19:44:38 GMT+0800 (China Standard Time)

Update: the server I am using has a 3090 GPU. I installed torch and CUDA with the following command, and that resolved the CUDA error:
$ pip install -i https://pypi.tuna.tsinghua.edu.cn/simple torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
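
The +cu113 suffix in that command names the CUDA version the wheel was built against, so it is worth confirming what actually got installed. A minimal sketch, assuming the usual "+cuXYZ" local-version convention of PyTorch wheels; `split_torch_version` is a hypothetical helper name:

```python
def split_torch_version(version):
    """Split e.g. "1.12.1+cu113" into ("1.12.1", "11.3").

    The CUDA part is None when there is no "+cuXYZ" suffix
    (for example a conda build or a CPU-only wheel).
    """
    base, _, local = version.partition("+")
    digits = local[2:]
    if local.startswith("cu") and len(digits) >= 2 and digits.isdigit():
        cuda = digits[:-1] + "." + digits[-1]
    else:
        cuda = None
    return base, cuda


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None:
        print("torch.__version__:", torch.__version__)
        print("parsed:", split_torch_version(torch.__version__))
        # torch.version.cuda reports the CUDA version the build targets.
        print("torch.version.cuda:", torch.version.cuda)
```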

Ruitong Sun · Answer 12 · Sat Jun 29 2024 14:38:45 GMT+0800 (China Standard Time)

Hi @ZauraizAlamgeer, I get the same RuntimeError: CUDA error: no kernel image is available for execution on the device, despite following the same procedure and package versions.

GPU: RTX 3090
Python: 3.8
cuda: 11.0
open3d: 0.18.0
torch==1.7.1 torchvision==0.8.2 torchaudio==0.7.0a0+a853dff

Could you please take a look? Thank you so much!

Ruitong Sun · Answer 13 · Sat Jun 29 2024 14:56:57 GMT+0800 (China Standard Time)

$ cd external_libs/pointops && python setup.py install

I can't really use pip instead of conda (as the other users who solved the problem did), because I don't have root (sudo) permission, so the CUDA toolkit can only be installed through conda.

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2788      G   /usr/lib/xorg/Xorg                              4MiB |
|    1   N/A  N/A      2788      G   /usr/lib/xorg/Xorg                              4MiB |
|    2   N/A  N/A      2788      G   /usr/lib/xorg/Xorg                              4MiB |
|    3   N/A  N/A      2788      G   /usr/lib/xorg/Xorg                              4MiB |
+-----------------------------------------------------------------------------------------+

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_18:49:52_PDT_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0
