train_wavernn.py CUDNN_STATUS_EXECUTION_FAILED on Windows 10 + CUDA 11 + cuDNN 8.0.4
serg06 opened this issue
serg06 commented
System
- Windows 10 x64
- RTX 3080
- PyTorch 1.7.0
- CUDA 11.0
- cuDNN 8.0.4
Problem
When I try to run python train_wavernn.py, it runs for a few steps and then fails with one of these messages:
CUDNN_STATUS_EXECUTION_FAILED
CUDNN_STATUS_INTERNAL_ERROR
RuntimeError: Unable to find a valid cuDNN algorithm to run convolution
Workaround 1:
(This only works on RTX 2000 and earlier.)
Downgrade to PyTorch 1.7.0 + CUDA 10.2 + cuDNN 7.6.4.
Workaround 2:
Switch to Linux. It works fine there, even with CUDA 11 + cuDNN 8.0.4.
Workaround 3:
(This slows down training 5x.)
At the top of train_wavernn.py, disable cuDNN:
torch.backends.cudnn.enabled = False
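Since turning cuDNN off costs a lot of speed, it may be worth applying it only on the setup that actually fails. Below is a minimal sketch of that idea, not something from the repo: the helper names should_disable_cudnn and apply_cudnn_workaround are hypothetical, and the "Windows + CUDA 11.x" check is an assumption based only on the configurations reported in this issue.

```python
import platform


def should_disable_cudnn(system, cuda_version):
    """Return True for the combination reported as failing here: Windows + CUDA 11.x.

    `system` is a platform.system() string, `cuda_version` is a string such as
    "11.0", or None for CPU-only PyTorch builds.
    """
    if system != "Windows" or cuda_version is None:
        return False
    # Compare only the major version, so "11.0" and "11.1" both match.
    return cuda_version.split(".")[0] == "11"


def apply_cudnn_workaround():
    """Disable cuDNN only when the current setup matches the failing one."""
    # Imported here so the check above can be used without PyTorch installed.
    import torch

    if should_disable_cudnn(platform.system(), torch.version.cuda):
        # PyTorch falls back to its non-cuDNN kernels: slower, but it avoids
        # the cuDNN code path entirely.
        torch.backends.cudnn.enabled = False
        return True
    return False
```

Calling apply_cudnn_workaround() once near the top of train_wavernn.py, before the model is created, would leave Linux and CUDA 10.2 setups at full speed.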
Other notes
- I tried building PyTorch 1.8.0a + CUDA 11.1 + cuDNN 8.0.4 from source, but that didn't fix the issue.
- I tried building PyTorch 1.8.0a + CUDA 11.1 + cuDNN 8.0.5 from source, but that didn't fix the issue.
Ollie McCarthy commented
Hi, I haven't tested this on the 3000 series cards. I'll be doing a bit of a spring clean soon and will check stuff like this. Thanks!