pytorch / QNNPACK

Quantized Neural Network PACKage - mobile-optimized implementation of quantized neural network operators

Home Page: https://code.fb.com/ml-applications/qnnpack/


Deconvolutional Layer is slow for large kernel size

yangwenca opened this issue · comments

The deconvolutional layer is very slow for large kernel sizes. Kernel size is 16x16x4x2, input size is 32x32x4, output size is 256x256x2, padding is 4 (top, bottom, left, right), stride is 8 (height, width), and dilation is 1 (height, width). The run time can be as slow as a few hundred ms.
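For reference, the reported configuration can be reproduced with a sketch like the following (the batch size of 1 and random input are assumptions; the channel/kernel/stride/padding values are taken from the report above):

```python
import torch
import torch.nn as nn

# Assumed reproduction of the reported layer:
# 4 input channels at 32x32, 2 output channels at 256x256,
# kernel 16x16, stride 8, padding 4, dilation 1.
deconv = nn.ConvTranspose2d(
    in_channels=4, out_channels=2,
    kernel_size=16, stride=8, padding=4,
)

x = torch.randn(1, 4, 32, 32)
y = deconv(x)
print(tuple(y.shape))  # (1, 2, 256, 256)
```

The output size follows from the transposed-convolution formula (32 - 1) * 8 - 2 * 4 + 16 = 256.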

This isn't surprising: you use a 16x16 kernel, which is 256x more expensive than a 1x1 kernel.
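A back-of-envelope multiply-accumulate count makes the 256x factor concrete (a rough cost model, assuming each input element feeds kH x kW output taps per output channel):

```python
# Rough MAC count for the reported transposed convolution,
# compared against the same layer with a 1x1 kernel.
in_h, in_w, in_c, out_c = 32, 32, 4, 2
k_h, k_w = 16, 16

macs_16x16 = in_h * in_w * in_c * out_c * k_h * k_w
macs_1x1 = in_h * in_w * in_c * out_c * 1 * 1
print(macs_16x16 // macs_1x1)  # 256
```

So the kernel area alone accounts for the 256x cost ratio, independent of the input resolution.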

@Maratyszcza I would like to know whether we can use torch.nn.Upsample here as a substitute for ConvTranspose2d. Is torch.nn.Upsample (linear, bilinear, trilinear) supported in QNNPACK?
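The substitution being asked about might look like the sketch below: a fixed bilinear upsample followed by a small learned convolution, rather than one large-kernel ConvTranspose2d. The 3x3 kernel size is an arbitrary choice for illustration, and whether this path actually runs through QNNPACK's quantized kernels is exactly the open question here:

```python
import torch
import torch.nn as nn

# Hypothetical cheaper alternative to the 16x16 ConvTranspose2d:
# fixed 8x bilinear upsampling, then a small 3x3 convolution.
up = nn.Sequential(
    nn.Upsample(scale_factor=8, mode='bilinear', align_corners=False),
    nn.Conv2d(in_channels=4, out_channels=2, kernel_size=3, padding=1),
)

x = torch.randn(1, 4, 32, 32)
y = up(x)
print(tuple(y.shape))  # (1, 2, 256, 256)
```

This produces the same 256x256x2 output shape, but the learned convolution now does far fewer multiply-accumulates per output element than the 16x16 transposed kernel.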