Question about neural network explosion

Question

zhangtzq opened this issue a year ago

Hi, I have reproduced the code for image attribution, but the network explodes during training. Surprisingly, the accuracy had reached about 85% before the explosion. When the explosion happens, the weights of the network become NaN and the cross-entropy (CE) loss increases. I am using the Adam optimizer. Why does this happen, and can you help me solve it?

vishal3477 · Answer 1 · Wed Mar 15 2023 01:14:53 GMT+0800 (China Standard Time)

Hi,
Can you provide more details about the error? This usually happens because of the FFT functions in PyTorch. I would advise restarting from the last good checkpoint with a lower learning rate, so that the model does not change much.
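
In case it helps, here is a minimal sketch of what restarting from a checkpoint with a lower learning rate could look like. It is not the repository's actual training code: the checkpoint keys `"model"` and `"optimizer"`, the helper names `resume_with_lower_lr` and `safe_step`, and the `lr_scale` value are all assumptions made for illustration. The non-finite check in `safe_step` is an extra guard I am adding, so that a NaN loss or gradient is skipped instead of being written into the weights.

```python
import torch
import torch.nn as nn


def resume_with_lower_lr(model, optimizer, checkpoint, lr_scale=0.1):
    """Restore model and optimizer state, then scale down the learning rate.

    `checkpoint` is assumed to be a dict with hypothetical keys
    "model" and "optimizer" holding the respective state dicts.
    """
    model.load_state_dict(checkpoint["model"])
    optimizer.load_state_dict(checkpoint["optimizer"])
    # Loading the optimizer state also restores the old learning rate,
    # so the reduction has to be applied afterwards.
    for group in optimizer.param_groups:
        group["lr"] *= lr_scale
    return model, optimizer


def safe_step(model, optimizer, loss):
    """Apply one optimizer step; skip it if the loss or any gradient is non-finite.

    Returns True if the step was applied, False if it was skipped.
    """
    optimizer.zero_grad()
    if not torch.isfinite(loss):
        return False
    loss.backward()
    grads_ok = all(
        bool(torch.isfinite(p.grad).all())
        for p in model.parameters()
        if p.grad is not None
    )
    if not grads_ok:
        optimizer.zero_grad()  # discard the bad gradients
        return False
    optimizer.step()
    return True
```

In a real run the checkpoint would typically come from `torch.load(path)` on a file saved before the loss started to diverge, and `safe_step(model, optimizer, loss)` would replace the usual `backward()` / `step()` pair in the training loop. If steps are skipped often, that points to where the NaN first appears.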