danoneata / xts

being a multi-speaker video-to-speech network

Output size is too small

csapot opened this issue

After following the README (installing the requirements, downloading the data, etc.), I tried to run a basic training:

python train.py --hparams magnus -d grid --filelist k-s01 -v

but it fails with the following error:

Current run is terminating due to exception: Given input size: (512x2x2). Calculated output size: (512x-4x-4). Output size is too small.
Engine run is terminating due to exception: Given input size: (512x2x2). Calculated output size: (512x-4x-4). Output size is too small.

Do you know why this happens?

Hello! I believe you are using an older version of torchvision. In its previous implementation of ResNet, the last pooling layer was AvgPool2d with a fixed kernel size, which fails when the incoming feature map is smaller than the kernel. It was subsequently replaced with AdaptiveAvgPool2d, which pools information across the entire feature map and, crucially, outputs a tensor of size D × 1 × 1 regardless of the size of the input image. I hope this helps!
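To make the negative size in the error message concrete, here is a minimal sketch of the standard output-size formula for a fixed-kernel pooling layer. The kernel size of 7 and stride of 1 are assumptions based on how older torchvision ResNets defined their average pooling; the helper name `pool_output_size` is made up for illustration and is not part of torchvision or this repository.

```python
def pool_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a fixed-kernel pooling layer (floor mode)."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# The error reports a 512x2x2 feature map, so each spatial side is 2.
feature_map_side = 2

# Assumption: the old ResNet head used a 7x7 average pool with stride 1.
old_kernel = 7

# A 2x2 map is smaller than the 7x7 kernel, so the computed size is negative.
print(pool_output_size(feature_map_side, old_kernel))

# With a 7x7 feature map (what the old head expected) the result is 1.
print(pool_output_size(7, old_kernel))
```

Under these assumptions the first call gives -4, which matches the 512x-4x-4 in the error. An adaptive pooling layer avoids the problem because it is given the desired output size and derives the pooling window from the input, so it never computes a negative dimension.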

Yes, updating torchvision solved it, and the training is now running. Thanks for the quick answer!
Maybe it would be worth adding the specific torchvision version to requirements.txt?