rasbt / deeplearning-models

A collection of various deep learning architectures, models, and tips

How to choose the parameter num_epochs

annyWangAn opened this issue · comments

First, thanks for your excellent project. It's very friendly for beginners.
I have noticed that on the dog-vs-cats dataset the hyperparameter num_epochs is 100, which is larger than on the CIFAR dataset, even though CIFAR has more data and more classes. So why do we need to train our net longer on this dataset? How can I choose a proper value for num_epochs? Looking forward to your response.

Hi there.

That's an interesting question. For this project, I didn't do extensive hyperparameter tuning. If I have some time this summer, I want to modify the code a bit, including visualizations of the loss etc., like I did for the class I was teaching this semester: https://github.com/rasbt/stat453-deep-learning-ss21/blob/main/L14/1.1-vgg16.ipynb
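Just to sketch the idea here: if you record the training loss per epoch, you can plot it and see where the curve flattens out (the values below are toy numbers, not results from this repo):

```python
import matplotlib.pyplot as plt

# made-up per-epoch training losses, just for illustration
epoch_losses = [1.9, 1.2, 0.8, 0.6, 0.5, 0.45, 0.44, 0.43, 0.43]

plt.plot(range(1, len(epoch_losses) + 1), epoch_losses, marker="o")
plt.xlabel("Epoch")
plt.ylabel("Training loss")
plt.title("A flattening curve suggests num_epochs is large enough")
plt.show()
```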

Thank you for your reply. This helps a lot. So when the input image becomes larger, we need to train more parameters, which is why we need to spend more time training our net. And when the loss drops slowly, the training can be stopped. Am I right?

Yes, this is correct: depending on the architecture, there may be more parameters involved (e.g., if you have fully connected layers). There are a few architectures that use only global average pooling layers (no fully connected layers), where the number of parameters may stay the same; however, the convolutional operations will still take more time because you have to move the kernel over the image in more steps.
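To make the first point concrete, here is a minimal sketch (a toy two-layer model, not one of this repo's architectures) showing how the parameter count of a fully connected head grows with the input resolution, while a global-average-pooling head stays constant:

```python
import torch.nn as nn

def num_params(model):
    return sum(p.numel() for p in model.parameters())

def fc_head_net(input_size, num_classes=10):
    # one stride-2 conv halves the spatial size (for even inputs), then a
    # fully connected layer maps the flattened feature map to the classes
    spatial = input_size // 2
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),
        nn.Flatten(),
        nn.Linear(8 * spatial * spatial, num_classes),  # grows with input size
    )

def gap_head_net(num_classes=10):
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),
        nn.AdaptiveAvgPool2d(1),  # global average pooling -> always 8 features
        nn.Flatten(),
        nn.Linear(8, num_classes),  # independent of input size
    )

for size in (32, 224):  # e.g., CIFAR-like vs. cats-vs-dogs-like resolution
    print(f"{size}x{size} input: FC head {num_params(fc_head_net(size)):,} "
          f"params, GAP head {num_params(gap_head_net()):,} params")
```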

Also, yeah, it can increase the training time. And you are right: if the loss doesn't change much anymore, training can probably be stopped.
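For example, a simple "stop once the loss plateaus" rule could look like the sketch below (toy model and data; patience and min_delta are illustrative choices, not values from this repo):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X, y = torch.randn(256, 20), torch.randint(0, 2, (256,))  # toy data
model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

best_loss, stale = float("inf"), 0
patience, min_delta = 5, 1e-4  # how long / how little change we tolerate

for epoch in range(100):  # num_epochs becomes an upper bound, not a fixed cost
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    if best_loss - loss.item() > min_delta:  # still improving meaningfully
        best_loss, stale = loss.item(), 0
    else:
        stale += 1
    if stale >= patience:  # loss has flattened out
        print(f"Stopping early at epoch {epoch}, loss {loss.item():.4f}")
        break
```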