yahoo / open_nsfw

Not Suitable for Work (NSFW) classification using deep neural network Caffe models.

About the input size of the Image

chibai opened this issue · comments

This is a very stupid question.

I saw that the Python script resizes the image to 256x256,

but line 6 of deploy.prototxt has input_param { shape: { dim: 1 dim: 3 dim: 224 dim: 224 } },
which means the input images should have 3 channels and a size of 224x224.

So what exactly is the right input size for the image, 256 or 224?
Or did I just misunderstand the Python script and the Caffe structure?

commented

The script does this:
1. load an image;
2. resize the image to 256x256 (line 64);
3. centre crop the resized image to 224x224 (line 71).

So the input to the model is 3x224x224, which matches the shape declared in deploy.prototxt.
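
To make the arithmetic concrete, here is a minimal sketch of how a centre crop window could be computed for a 256x256 image and a 224x224 crop. The helper name `center_crop_box` is hypothetical and is not taken from the repository's script; it only illustrates the index calculation.

```python
def center_crop_box(height, width, crop_h, crop_w):
    """Return (top, left, bottom, right) of a centred crop window.

    Hypothetical helper for illustration; not part of open_nsfw.
    """
    if crop_h > height or crop_w > width:
        raise ValueError("crop size must not exceed image size")
    # Split the leftover border evenly between the two sides.
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return top, left, top + crop_h, left + crop_w


# 256x256 resized image, 224x224 network input: a 16-pixel border is dropped on each side.
box = center_crop_box(256, 256, 224, 224)
print(box)  # (16, 16, 240, 240)
```

With an image stored as a height x width x channels array, the crop would then be something like `img[top:bottom, left:right, :]`, giving a 224x224x3 array that is transposed to 3x224x224 before being fed to the network.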

The resize to 256x256 followed by a random crop to 224x224 is done during training for data augmentation; see Section 3.4 of the residual networks paper. At runtime / test time we take the center crop after the resize instead.
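
As a rough illustration of the training-time side, the sketch below picks a random 224x224 window inside a 256x256 image. The function name `random_crop_box` and the seeded `random.Random` instance are assumptions made for this example; the actual training pipeline is not shown in this thread.

```python
import random


def random_crop_box(height, width, crop_h, crop_w, rng):
    """Return (top, left, bottom, right) of a randomly placed crop window.

    Hypothetical helper for illustration; rng is a random.Random instance.
    """
    if crop_h > height or crop_w > width:
        raise ValueError("crop size must not exceed image size")
    # randint is inclusive on both ends, so every valid offset can be chosen.
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    return top, left, top + crop_h, left + crop_w


rng = random.Random(0)  # fixed seed only so the example is repeatable
boxes = [random_crop_box(256, 256, 224, 224, rng) for _ in range(5)]
for top, left, bottom, right in boxes:
    print(top, left, bottom, right)
```

Each call yields a different window with offsets between 0 and 32, so the network sees slightly shifted views of the same image during training, while the fixed centre crop at test time keeps predictions deterministic.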