tamerthamoqa / facenet-pytorch-glint360k

A PyTorch implementation of the 'FaceNet' paper for training a facial recognition model with Triplet Loss using the glint360k dataset. A pre-trained model using Triplet Loss is available for download.


RuntimeError: Given input size: (1536x5x5). Calculated output size: (1536x0x0). Output size is too small

JinhaSong opened this issue · comments

I tried to run a test that trains on only 20 of the VGGFace2 training data.
I also tried to use Inception-ResNet-V2 as the network architecture, but it does not run and fails with the following error:

RuntimeError: Given input size: (1536x5x5). Calculated output size: (1536x0x0). Output size is too small

Please let me know if you have any guesses as to the cause, or if I have done something wrong.

Hello JinhaSong,

Did you specify the input image size to be 299x299? I think the minimum should be 150x150, but I am not sure about that, to be honest.

For reference, I have imported the Inception-ResNet-V2 implementation from Cadene's repository here:
https://github.com/Cadene/pretrained-models.pytorch/blob/master/pretrainedmodels/models/inceptionresnetv2.py
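A quick way to check whether the input size is the cause is to look at the feature-map shape before the final pooling layer. This is only a sketch, assuming Cadene's `pretrainedmodels` package is installed; the exact input size used in the failing run is not known here, 224x224 is just an example that reproduces a 5x5 map.

```python
import torch
import pretrainedmodels  # Cadene's package, linked above

# Build the network without pretrained weights just to inspect feature-map sizes.
model = pretrainedmodels.inceptionresnetv2(num_classes=1000, pretrained=None)
model.eval()

with torch.no_grad():
    # Reference input size: the final feature map is 8x8, which the
    # AvgPool2d(kernel_size=8) layer expects.
    print(model.features(torch.randn(1, 3, 299, 299)).shape)  # torch.Size([1, 1536, 8, 8])

    # A smaller input (224x224 here, as an example) leaves only a 5x5 map,
    # so pooling with kernel_size=8 would compute a 0x0 output and raise the error.
    print(model.features(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 1536, 5, 5])
```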

Hi JinhaSong,
I had a similar issue and managed to fix it by setting the AvgPool2d kernel_size to 5. Since the
VGGFace2 images are smaller, the network at that layer ends up with a 1536x5x5 tensor instead of 1536x8x8.
Applying pooling with kernel_size 8, which is larger than what the previous layer provides, causes this issue. Hope that helps!
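A minimal sketch of the adjustment described above, assuming the pooling layer is named `avgpool_1a` as in Cadene's implementation (layer name and sizes are taken from that code, not verified against this repository's training script):

```python
import torch
import torch.nn as nn
import pretrainedmodels  # assumption: Cadene's package is installed

model = pretrainedmodels.inceptionresnetv2(num_classes=1000, pretrained=None)

# The fix suggested in this thread: match the pooling kernel to the 5x5
# feature map produced by the smaller VGGFace2 crops.
model.avgpool_1a = nn.AvgPool2d(kernel_size=5, count_include_pad=False)

# Alternative (not from this thread): adaptive pooling always reduces the
# feature map to 1x1 regardless of the input resolution.
# model.avgpool_1a = nn.AdaptiveAvgPool2d(1)

out = model(torch.randn(1, 3, 224, 224))  # no longer raises the RuntimeError
print(out.shape)  # torch.Size([1, 1000])
```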