Is it true that 1000 epochs are needed?
lovekeyczw opened this issue
In my lab, one epoch takes about 800 seconds, so 1000 epochs is far too many.
Don't you get a KeyError somewhere after the 10th epoch? Mine takes a long time too but stops due to this error.
I also find this strange; I guess you're supposed to stop it yourself once it converges? That seems to make reproducibility difficult.
Also, the epoch counter seems to reset to 0 often, so it's hard to tell which epoch you're actually on.
It's very confusing to know when to stop the training.
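Since the training loop apparently doesn't stop on its own, one workaround is to wrap it in a simple early-stopping check. Here is a minimal sketch, assuming a standard PyTorch setup; `train_one_epoch`, `evaluate`, `model`, and the data loaders are hypothetical placeholders, not functions from this repo:

```python
import torch

# Hypothetical early-stopping wrapper (not part of this repository).
# train_one_epoch() and evaluate() stand in for your own training and
# test-accuracy functions.

patience = 10        # stop after 10 epochs with no test-accuracy improvement
best_acc = 0.0
stale_epochs = 0

for epoch in range(1000):
    train_one_epoch(model, train_loader, optimizer)
    acc = evaluate(model, test_loader)

    if acc > best_acc:
        best_acc = acc
        stale_epochs = 0
        torch.save(model.state_dict(), "best_model.pth")  # keep the best checkpoint
    else:
        stale_epochs += 1

    if stale_epochs >= patience:
        print(f"Early stop at epoch {epoch}: best test acc {best_acc:.3f}")
        break
```

This way the 1000-epoch limit just becomes an upper bound, and the run ends shortly after the accuracy plateaus.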
I just finished running 1000 epochs on a machine with 4 Titan V cards, which took 15 days. However, the model reached its maximum test accuracy (~78.3%) at epoch 15; after that it oscillated between 76% and 77% for the rest of the training.
Hello, the experiments in the paper were done with 50 epochs or fewer.
Thanks @alinajadebarnett!
So that means roughly 30-50 epochs were run on the augmented dataset (which is about 30x larger), plus an additional ~20 epochs to fine-tune the output linear layer after projection?
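For reference, fine-tuning only the output layer after projection usually means freezing everything else. A rough PyTorch sketch, assuming the classifier is exposed as `model.last_layer` and the forward pass returns class logits (both assumptions; the actual repository code may differ):

```python
import torch
import torch.nn.functional as F

# Sketch of last-layer-only fine-tuning after prototype projection.
# model.last_layer and the forward signature are assumed names.

for p in model.parameters():
    p.requires_grad = False                 # freeze backbone and prototype layers
for p in model.last_layer.parameters():
    p.requires_grad = True                  # train only the output linear layer

optimizer = torch.optim.Adam(model.last_layer.parameters(), lr=1e-4)

for epoch in range(20):                     # the ~20 fine-tuning epochs discussed above
    for x, y in train_loader:
        optimizer.zero_grad()
        logits = model(x)                   # assuming the model returns class logits
        loss = F.cross_entropy(logits, y)
        loss.backward()
        optimizer.step()
```

Since only the linear layer's parameters are updated, these epochs are much cheaper than the main training epochs.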