QConv1d: no valid convolution algorithms available in CuDNN
xesdiny opened this issue
X_Bee commented
In https://github.com/ucbrise/actnn/blob/main/tests/test_conv_layer.py, lines 52-56:
When I try to run test_conv_layer.py, it fails with the following stack trace:
~/code/actnn/tests$ CUDA_VISIBLE_DEVICES=1 python test_conv_layer.py
Conv1d(100, 4, kernel_size=(3,), stride=(2,), groups=2)
QConv1d(100, 4, kernel_size=(3,), stride=(2,), groups=2)
torch.Size([4, 50, 3])
torch.Size([10, 100, 2000]) tensor([2, 0, 3, 0, 0, 2, 3, 0, 2, 1], device='cuda:0')
Traceback (most recent call last):
File "test_conv_layer.py", line 60, in <module>
test(layer, qlayer, x, y)
File "test_conv_layer.py", line 33, in test
grads.append(get_grad(qlayer))
File "test_conv_layer.py", line 27, in get_grad
loss.backward()
File "/data/users/root/anaconda3/envs/jukebox/lib/python3.7/site-packages/torch/tensor.py", line 245, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/data/users/root/anaconda3/envs/jukebox/lib/python3.7/site-packages/torch/autograd/__init__.py", line 147, in backward
allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag
File "/data/users/root/anaconda3/envs/jukebox/lib/python3.7/site-packages/torch/autograd/function.py", line 89, in apply
return self._forward_cls.backward(self, *args) # type: ignore
File "/data/users/root/code/actnn/actnn/actnn/ops.py", line 244, in backward
return convnd.run_backward(1, ctx, grad_output, [0, 2], _single)
File "/data/users/root/code/actnn/actnn/actnn/ops.py", line 225, in run_backward
[ctx.needs_input_grad[0], ctx.needs_input_grad[1]])
RuntimeError: no valid convolution algorithms available in CuDNN
X_Bee commented
Is anybody there?
Lianmin Zheng commented
Conv1d support has bugs; we forgot to remove that code.
Please do not use Conv1d. You will have to manually rewrite your model with Conv2d.
You can think of Conv1d as a special case of Conv2d and rewrite your model accordingly, so this is not hard.
PyTorch also does this internally for nn.Conv1d.
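A minimal sketch of this rewrite, assuming plain PyTorch: a hypothetical `Conv1dViaConv2d` wrapper (the name and class are illustrative, not part of actnn) that emulates `nn.Conv1d` with `nn.Conv2d` by inserting a dummy height dimension of size 1. Inside an actnn model you would use `QConv2d` the same way.

```python
import torch
import torch.nn as nn


class Conv1dViaConv2d(nn.Module):
    """Emulate nn.Conv1d with nn.Conv2d by treating the sequence as a
    1-pixel-tall image: kernel (1, k), stride (1, s), padding (0, p)."""

    def __init__(self, in_channels, out_channels, kernel_size, stride=1,
                 padding=0, groups=1, bias=True):
        super().__init__()
        self.conv2d = nn.Conv2d(in_channels, out_channels,
                                kernel_size=(1, kernel_size),
                                stride=(1, stride),
                                padding=(0, padding),
                                groups=groups, bias=bias)

    def forward(self, x):
        # x: (N, C, L) -> (N, C, 1, L)
        x = x.unsqueeze(2)
        x = self.conv2d(x)          # (N, C_out, 1, L_out)
        return x.squeeze(2)         # (N, C_out, L_out)


# Matches the shapes from the failing test: Conv1d(100, 4, 3, stride=2, groups=2)
layer = Conv1dViaConv2d(100, 4, kernel_size=3, stride=2, groups=2)
out = layer(torch.randn(10, 100, 2000))
print(out.shape)  # torch.Size([10, 4, 999])
```

Because the Conv2d weight is just the Conv1d weight with an extra unit dimension, the two layers compute identical outputs when given the same parameters.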
X_Bee commented
Thx a lot~