Training accuracy does not change across epochs
myegdfz opened this issue · comments
Hi! Could you elaborate on what you changed in the code?
Thanks for the details. I would suggest first making sure that the original architecture can decrease the training error on a small dataset.
Only after that would I start making adjustments to the model architecture, following the same process each time. Make sure you test Adam and your modified SGD with different learning rates. If you make big adjustments all at once, it's hard to identify the cause of non-convergence. Hope this helps!
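A quick way to act on the learning-rate advice is to run a short sweep and watch whether the loss shrinks, plateaus, or blows up at each rate. This is an illustrative sketch, not the project's actual training code: the "model" here is just a 1-D quadratic, f(w) = (w - 3)^2, so the three behaviors are easy to see.

```python
# Illustrative learning-rate sweep on a toy objective (hypothetical
# example, not the project's training loop). For f(w) = (w - 3)^2,
# plain gradient descent diverges when lr >= 1, converges fast near
# lr = 0.5, and crawls when lr is tiny.
def final_loss(lr, steps=50):
    w = 0.0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)  # d/dw of (w - 3)^2
        w -= lr * grad
    return (w - 3.0) ** 2

for lr in (1.5, 0.5, 0.05, 0.001):
    print(f"lr={lr}: final loss {final_loss(lr):.3g}")
```

The same diagnostic applies to a real network: a diverging loss usually means the rate is too high, while a flat loss can mean it is too low (or that something else, like the data pipeline or label alignment, is broken).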
Do you mean that if my initial loss is too high, training would stop very quickly, maybe at iteration 4?
No, I meant that you should first verify that the architecture converges on a small dataset, i.e. that the training loss is actually decreasing there.
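The sanity check described above can be sketched in a few lines. This is a hypothetical toy setup (logistic regression on a handful of separable samples, trained with plain gradient descent), standing in for whatever model is actually being debugged: if the training loss does not drop sharply on a tiny dataset the model should trivially overfit, something in the architecture or optimizer setup is broken.

```python
import numpy as np

# Toy stand-in for the real model: logistic regression on 8 samples.
# Any reasonable model/optimizer pair should drive training loss
# toward zero on a set this small.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))       # 8 samples, 4 features
y = (X[:, 0] > 0).astype(float)   # linearly separable labels

w = np.zeros(4)
b = 0.0
lr = 0.5

def loss(w, b):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

initial = loss(w, b)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= lr * (X.T @ (p - y) / len(y))  # gradient of the cross-entropy
    b -= lr * np.mean(p - y)

final = loss(w, b)
print(f"loss: {initial:.3f} -> {final:.3f}")
```

Only once this kind of check passes is it worth scaling back up to the full dataset and tuning from there.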
I would suggest reading this blog post by Andrej Karpathy, especially the overfit section: