Deconvolutional Layer is slow for large kernel size
yangwenca opened this issue · comments
The deconvolutional layer is very slow for large kernel sizes. With kernel size 16x16x4x2, input size 32x32x4, output size 256x256x2, padding 4 (top, bottom, left, right), stride 8 (height, width), and dilation 1 (height, width), the run time can be as slow as a few hundred ms.
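For reference, the output size above follows from the standard transposed-convolution output-size formula (a small sketch; the helper name is ours, not a library API):

```python
def conv_transpose2d_out(in_size, kernel, stride, padding, dilation=1, output_padding=0):
    # Standard transposed-convolution output-size formula:
    # (in - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1
    return (in_size - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1

# Parameters from this issue: input 32x32, kernel 16, stride 8, padding 4
h = conv_transpose2d_out(32, 16, 8, 4)
w = conv_transpose2d_out(32, 16, 8, 4)
print(h, w)  # 256 256
```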
This isn't surprising: you use a 16x16 kernel, which is 256x more expensive than a 1x1 kernel.
@Maratyszcza I would like to know whether we can use torch.nn.Upsample here as a substitute for ConvTranspose2d, and whether torch.nn.Upsample (linear, bilinear, trilinear) is supported in QNNPACK?
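Note that Upsample alone has no learned weights, so it cannot replace ConvTranspose2d by itself; the common substitution is upsample followed by a small regular convolution. A minimal sketch with the channel counts and 8x upscaling from this issue (the 3x3 kernel is an illustrative choice, not a drop-in equivalent of the 16x16 transposed kernel):

```python
import torch
import torch.nn as nn

# Upsample-then-convolve substitute for ConvTranspose2d. The cheap
# interpolation does the 8x spatial scaling, and a small 3x3 conv does
# the learned mixing, avoiding the large-kernel cost.
model = nn.Sequential(
    nn.Upsample(scale_factor=8, mode="bilinear", align_corners=False),
    nn.Conv2d(in_channels=4, out_channels=2, kernel_size=3, padding=1),
)

x = torch.randn(1, 4, 32, 32)  # NCHW input from the issue
y = model(x)
print(tuple(y.shape))  # (1, 2, 256, 256)
```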