Training failed after 60000 iterations
qiuqiangkong opened this issue · comments
Hi, really brilliant code! When I run the PyTorch implementation, I found the training and validation loss increase dramatically after 60000 iterations. The loss curve looks like this: https://drive.google.com/open?id=1_64-jD3hOtXmrOoVMq5hD8pWfmEUc7yE
Do you have any idea of this increased loss? Thank you very much!
Qiuqiang
Many thanks for the reply! I figured out the problem. In the Adam optimizer, I need to set this flag to true: amsgrad=True. That fixes the problem.
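For anyone hitting the same thing, here is a minimal sketch of where the flag goes in PyTorch. The model below is just a placeholder `nn.Linear` (the actual project trains SampleRNN); only the `amsgrad=True` argument is the point.

```python
import torch
import torch.nn as nn

# Placeholder model -- any nn.Module works to show where the flag goes.
model = nn.Linear(10, 1)

# amsgrad=True switches Adam to the AMSGrad variant, which keeps the
# running maximum of the second-moment estimate instead of letting it
# decay, so the effective step size cannot blow up late in training.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, amsgrad=True)

# One dummy training step to show usage.
x, y = torch.randn(4, 10), torch.randn(4, 1)
loss = nn.functional.mse_loss(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```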
@qiuqiangkong Hi qiuqiang, I am using TensorFlow to implement SampleRNN and I've encountered the same problem; however, I don't think tf.train.AdamOptimizer has this amsgrad flag. Any idea how to troubleshoot this? Thanks in advance!
@zguo008 Interesting! I think Keras has the amsgrad flag. Or try the RMSprop optimizer instead. I guess this phenomenon is caused by the optimizer. Let me know if you still can't solve this problem.
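To see why plain Adam can blow up here, this framework-free sketch (made-up gradient values, not from the actual training run) compares Adam's second-moment estimate with AMSGrad's. After one large gradient followed by many tiny ones, Adam's estimate decays toward zero, so the 1/sqrt(v) step scale grows; AMSGrad's running max keeps the denominator from shrinking.

```python
def second_moments(grads, beta2=0.999):
    """Return Adam's v_t and AMSGrad's running-max estimate for each step."""
    v, v_max = 0.0, 0.0
    vs, v_maxes = [], []
    for g in grads:
        v = beta2 * v + (1 - beta2) * g * g  # Adam's exponential average
        v_max = max(v_max, v)                # the only change AMSGrad makes
        vs.append(v)
        v_maxes.append(v_max)
    return vs, v_maxes

# One gradient spike, then thousands of tiny gradients.
grads = [10.0] + [0.01] * 5000
vs, v_maxes = second_moments(grads)

# Adam's estimate has decayed far below the spike; AMSGrad remembers it,
# so AMSGrad's effective step size stays bounded while Adam's grows.
print(vs[-1], v_maxes[-1])
```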
Hi qiuqiang, thanks for your reply :) I'll try RMSprop later. I tuned down the value of epsilon in the Adam optimizer and it seems that there's no sudden increase in loss for now. Thank you for your answer, it's a great help!
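For reference on why epsilon matters at all: it enters Adam's per-parameter update as lr * m_hat / (sqrt(v_hat) + eps), so when the second-moment estimate v_hat has decayed to near zero, epsilon is the only thing bounding the step size. The sketch below uses made-up values just to show that sensitivity; it does not reproduce the actual training run.

```python
import math

def adam_step_size(m_hat, v_hat, lr=1e-3, eps=1e-8):
    """Magnitude of a single Adam parameter update."""
    return lr * m_hat / (math.sqrt(v_hat) + eps)

# With v_hat decayed to near zero, the step is roughly lr * m_hat / eps,
# so epsilon directly sets how large the spike can get.
spike = adam_step_size(m_hat=0.1, v_hat=1e-12, eps=1e-8)
damped = adam_step_size(m_hat=0.1, v_hat=1e-12, eps=1e-3)
print(spike, damped)
```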
@zguo008 That is great! What epsilon are you using now?