RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR

Question

zxy630 opened this issue a year ago

I tried to run './deeplesion/eval.sh ./deeplesion/mconfigs/densenet_a3d.py ./deeplesion/model_weights/adap_7slice_weigts.pth' but got the error below. It has been bothering me for days.

Here is the output:
'''
./deeplesion/mconfigs/densenet_a3d.py
a3d 7 slice
[ ] 0/160, elapsed: 0s, ETA:
Traceback (most recent call last):
  File "./deeplesion/eval.py", line 210, in <module>
    main(checkpoint, cfg_path)
  File "./deeplesion/eval.py", line 196, in main
    outputs = single_gpu_test(model, dl)
  File "./deeplesion/eval.py", line 101, in single_gpu_test
    r = model(return_loss=False, rescale=False, **data)
  File "/disk/user/zxy/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/disk/user/zxy/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/disk/user/zxy/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/disk/user/zxy/project/AlignShift/mmdet/core/fp16/decorators.py", line 49, in new_func
    return old_func(*args, **kwargs)
  File "/disk/user/zxy/project/AlignShift/mmdet/models/detectors/base.py", line 122, in forward
    return self.forward_test(img, img_meta, **kwargs)
  File "/disk/user/zxy/project/AlignShift/mmdet/models/detectors/base.py", line 105, in forward_test
    return self.simple_test(imgs, img_metas, **kwargs)
  File "/disk/user/zxy/project/AlignShift/mmdet/models/detectors/two_stage.py", line 268, in simple_test
    x = self.extract_feat(img)
  File "/disk/user/zxy/project/AlignShift/mmdet/models/detectors/two_stage.py", line 92, in extract_feat
    x = self.backbone(img)
  File "/disk/user/zxy/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/disk/user/zxy/project/AlignShift/nn/models/truncated_densenet3d_a3d.py", line 168, in forward
    x = self.conv0(x)
  File "/disk/user/zxy/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/disk/user/zxy/project/AlignShift/nn/operators/a3dconv.py", line 59, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR
'''

I would appreciate any suggestions. Thanks so much.

Yuiko630 · Answer 1 · Fri Aug 25 2023 21:51:36 GMT+0800 (China Standard Time)

My environment: PyTorch 1.3.1, torchvision 0.4.2, CUDA 10.1.243, tested on a machine with 4 RTX 3090 GPUs.

He Yi · Answer 2 · Sun Aug 27 2023 19:54:57 GMT+0800 (China Standard Time)

Hi,
This looks like a CUDA error, possibly caused by a corrupted PyTorch environment. You can try running a single conv module to check whether the environment is in good condition. If it is not, reinstalling PyTorch may solve the problem.

Yuiko630 · Answer 3 · Mon Aug 28 2023 15:38:56 GMT+0800 (China Standard Time)

I have tried torch 1.3.1, 1.5.0, 1.7.1, and 1.8.0, and the same problem occurs with each of them.
Could you tell me which versions you tested with (torch, CUDA, and GPU), if convenient?
Thanks.

He Yi · Answer 4 · Tue Aug 29 2023 09:13:00 GMT+0800 (China Standard Time)

The traceback you provided shows that torch cannot run a conv module successfully. So try running a single conv module to see whether torch works, like this:
import torch
conv = torch.nn.Conv2d(4, 16, 3).cuda()
x = torch.rand(2, 4, 128, 128).cuda()  # B, C, H, W
y = conv(x)
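
A slightly fuller version of this check is sketched below. It is not from the AlignShift repository: the helper names `describe_environment` and `conv_smoke_test` are made up for illustration. It prints the versions that matter here and then runs one small convolution. One thing worth comparing in the output: the RTX 3090 has compute capability 8.6, which CUDA toolkits before 11.1 do not target, so a PyTorch build compiled against CUDA 10.1 is a plausible (though unconfirmed in this thread) source of cuDNN failures on that card.

```python
import torch
import torch.nn as nn


def describe_environment():
    """Collect the version details that matter when debugging cuDNN errors."""
    info = {
        "torch": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["cudnn"] = torch.backends.cudnn.version()
        info["gpu"] = torch.cuda.get_device_name(0)
        info["compute_capability"] = torch.cuda.get_device_capability(0)
    return info


def conv_smoke_test(device):
    """Run one small Conv2d forward pass and return the output shape."""
    conv = nn.Conv2d(4, 16, kernel_size=3).to(device)
    x = torch.rand(2, 4, 128, 128, device=device)  # B, C, H, W
    with torch.no_grad():
        y = conv(x)
    if device.startswith("cuda"):
        # CUDA kernels run asynchronously; synchronizing makes errors surface here.
        torch.cuda.synchronize()
    return tuple(y.shape)


# Fall back to the CPU so the script still runs on machines without a GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(describe_environment())
print(conv_smoke_test(device))
```

If the convolution fails on `cuda` but succeeds on `cpu`, the problem is more likely in the PyTorch/CUDA/driver combination than in the model code.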

Yuiko630 · Answer 5 · Sat Sep 02 2023 15:24:53 GMT+0800 (China Standard Time)

Sorry to bother you again.
Evaluation now works, but an error occurs when I train:
'''
Traceback (most recent call last):
  File "./deeplesion/train_dist.py", line 121, in <module>
    main(args)
  File "./deeplesion/train_dist.py", line 116, in main
    logger=logger)
  File "/home/zhangyi/workplace/AlignShiftv2/mmdet/apis/train.py", line 68, in train_detector
    _dist_train(model, dataset, cfg, validate=validate)
  File "/home/zhangyi/workplace/AlignShiftv2/mmdet/apis/train.py", line 204, in _dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/zhangyi/anaconda3/envs/a3d/lib/python3.6/site-packages/mmcv/runner/runner.py", line 358, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/zhangyi/anaconda3/envs/a3d/lib/python3.6/site-packages/mmcv/runner/runner.py", line 260, in train
    for i, data_batch in enumerate(data_loader):
  File "/home/zhangyi/anaconda3/envs/a3d/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 346, in __next__
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/zhangyi/anaconda3/envs/a3d/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/zhangyi/anaconda3/envs/a3d/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/zhangyi/workplace/AlignShiftv2/deeplesion/dataset/DeepLesionDataset_a3d.py", line 110, in __getitem__
    results = self.pre_pipeline(results)
  File "/home/zhangyi/workplace/AlignShiftv2/mmdet/datasets/pipelines/compose.py", line 24, in __call__
    data1 = t(data)
  File "/home/zhangyi/workplace/AlignShiftv2/mmdet/datasets/pipelines/transforms.py", line 817, in __call__
    results = self.aug(**results)
  File "/home/zhangyi/anaconda3/envs/a3d/lib/python3.6/site-packages/albumentations/core/composition.py", line 158, in __call__
    data = t(force_apply=force_apply, **data)
  File "/home/zhangyi/anaconda3/envs/a3d/lib/python3.6/site-packages/albumentations/core/transforms_interface.py", line 65, in __call__
    res[key] = target_function(arg, **dict(params, **target_dependencies))
  File "/home/zhangyi/anaconda3/envs/a3d/lib/python3.6/site-packages/albumentations/augmentations/transforms.py", line 513, in apply
    return F.shift_scale_rotate(img, angle, scale, dx, dy, interpolation, self.border_mode, self.value)
  File "/home/zhangyi/anaconda3/envs/a3d/lib/python3.6/site-packages/albumentations/augmentations/functional.py", line 58, in wrapped_function
    result = func(img, *args, **kwargs)
  File "/home/zhangyi/anaconda3/envs/a3d/lib/python3.6/site-packages/albumentations/augmentations/functional.py", line 168, in shift_scale_rotate
    img = cv2.warpAffine(img, matrix, (width, height), flags=interpolation, borderMode=border_mode, borderValue=value)
cv2.error: OpenCV(4.1.0) /io/opencv/modules/imgproc/src/imgwarp.cpp:2597: error: (-215:Assertion failed) _src.channels() <= 4 || (interpolation != INTER_LANCZOS4 && interpolation != INTER_CUBIC) in function 'warpAffine'
'''

I have tried many OpenCV versions, but none of them work. Can you give me some tips?

He Yi · Answer 6 · Sun Sep 03 2023 22:20:54 GMT+0800 (China Standard Time)

Check the albumentations version and use a compatible OpenCV version.