rasmusbergpalm / DeepLearnToolbox

Matlab/Octave toolbox for deep learning. Includes Deep Belief Nets, Stacked Autoencoders, Convolutional Neural Nets, Convolutional Autoencoders and vanilla Neural Nets. Each method has examples to get you started.

How do CNN parameters depend on input image size?

mrgloom opened this issue · comments

I'm trying to modify the example test_example_CNN.m to work with my images.
I have a pedestrian detection dataset with two classes: positive (pedestrians) and negative (background). The images are 128x64. When I run the code without changes the error increases(!), but when I resize the images to 28x28 it works.

So my question is: how do the CNN parameters depend on image size?

commented

Same here. Is there any documentation for configuring the CNN?

Try a smaller learning rate. Usually you try learning rates in powers of 10, i.e. 0.1, 0.01, 0.001 and so on, and pick the first one that makes your loss decrease. Choosing good hyperparameters for deep networks is still an art; you can find a few rules of thumb in these articles:
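For DeepLearnToolbox specifically, a minimal sketch of such a learning-rate sweep could look like the code below. It assumes train_x/train_y are prepared as in test_example_CNN.m and that a variable named layers holds the cnn.layers cell array from that example; cnn.rL is the running loss that the example plots.

% Sketch only: try learning rates in decreasing powers of 10 and keep the
% largest one whose loss actually goes down.
rates = [0.1 0.01 0.001];
for i = 1 : numel(rates)
    cnn = struct();
    cnn.layers = layers;                  % fresh, untrained network each time
    cnn = cnnsetup(cnn, train_x, train_y);

    opts.alpha     = rates(i);            % learning rate under test
    opts.batchsize = 50;
    opts.numepochs = 1;

    cnn = cnntrain(cnn, train_x, train_y, opts);

    % cnn.rL holds the smoothed mini-batch MSE that test_example_CNN.m plots
    fprintf('alpha = %g, final loss = %g\n', rates(i), cnn.rL(end));
end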

commented

Thanks for the information. However, I was interested in how to set up the structure of the CNN here: https://github.com/rasmusbergpalm/DeepLearnToolbox/blob/master/tests/test_example_CNN.m#L15-L21

I would start with some well-known architecture. The CIFAR-10 examples are a good start if your images are not too big. Otherwise AlexNet, but AlexNet is way too big for DeepLearnToolbox to handle.

For example, the CIFAR-10 network in the Caffe examples has worked well for me:
https://github.com/BVLC/caffe/blob/master/examples/cifar10/cifar10_quick_train_test.prototxt
Hopefully you can figure out the layer parameters from all this prototxt cruft.
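If it helps, in DeepLearnToolbox the structure at test_example_CNN.m#L15-L21 is just a cell array of layer structs. A rough adaptation to 128x64 single-channel inputs might look like the sketch below; the outputmaps and kernelsize values are only placeholders copied from the 28x28 example, not a recommendation, and (as far as I recall) cnnsetup asserts that every subsampled map size comes out as an integer.

% Sketch only, not a tested configuration: test_example_CNN.m-style
% layer definition for 128x64 single-channel inputs.
cnn.layers = {
    struct('type', 'i')                                      % input: 128x64
    struct('type', 'c', 'outputmaps', 6,  'kernelsize', 5)   % -> 124x60, 6 maps
    struct('type', 's', 'scale', 2)                          % -> 62x30
    struct('type', 'c', 'outputmaps', 12, 'kernelsize', 5)   % -> 58x26, 12 maps
    struct('type', 's', 'scale', 2)                          % -> 29x13
};

Each 'c' layer shrinks the maps by kernelsize - 1 and each 's' layer divides them by scale, which is exactly where the dependence on input image size comes from.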

I found this formula in Andrej Karpathy's CNN course and it worked for me (it's really simple once you think about it for a while).

It assumes square images, a square kernel_size, and equal vertical and horizontal strides!

in_channels = 3   # for the first layer, because a color image has 3 channels (3 matrices -> red, green, blue); use 1 for grayscale
out_width   = (image_width - kernel_size + 2*padding) / stride + 1   # spatial size of the layer's output, must be a positive integer

# if you don't know what these variables mean, google them -> these are the basics of CNNs

Note that the formula gives the spatial size of a layer's output, not its number of channels. in_channels and out_channels are separate parameters of each convolution layer (out_channels is simply the number of filters you choose), and each following layer's in_channels equals the out_channels of the previous one.
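Applied to the 128x64 images from the original question, a quick Octave/MATLAB walk-through of the formula (my own calculation, assuming a test_example_CNN.m-style stack of 5x5 convolutions with no padding and stride 1, interleaved with 2x2 subsampling) gives:

% Sketch only: every intermediate size must come out as an integer.
sz = [128 64]                % input height x width
sz = (sz - 5 + 2*0) / 1 + 1  % conv, kernel 5   -> 124 60
sz = sz / 2                  % subsample by 2   ->  62 30
sz = (sz - 5 + 2*0) / 1 + 1  % conv, kernel 5   ->  58 26
sz = sz / 2                  % subsample by 2   ->  29 13

So 128x64 images do produce valid integer sizes for this particular stack, which suggests the divergence seen in the original question is more a learning-rate issue than a size issue, as noted above.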