nan loss

Question

jimmy-dq opened this issue 2 years ago

Hi junchen,
Thanks for your great work. Recently, when I tried your local reconstruction training, I found that NaN loss occurs easily during training. Do you have any suggestions for this? Thanks.

Jun Chen · Answer 1 · Wed Aug 03 2022 18:06:33 GMT+0800 (China Standard Time)

Hi, can you share more of your training details so I can better understand the problem? I seldom run into NaN problems in my experiments.

jimmy · Answer 2 · Wed Aug 03 2022 22:58:55 GMT+0800 (China Standard Time)

Thanks for your reply. I disabled iRPE and used my own dataset for training. The NaN loss occurs early in training (i.e., at epoch 8). The problem is solved when I disable AMP training, i.e., amp.autocast(enabled=False).
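For anyone hitting the same issue, here is a minimal sketch of what toggling AMP can look like in a training step. It assumes the thread refers to PyTorch's automatic mixed precision; `train_step`, `use_amp`, and the MSE loss are hypothetical names and choices for illustration and are not taken from the repository's training script. The finite-loss check is an extra added here to surface the first bad step early.

```python
import torch
from torch import nn


def train_step(model, batch, target, optimizer, scaler, use_amp):
    """Run one optimization step; use_amp=False keeps the forward pass in full precision.

    Hypothetical helper for illustration, not the repository's training loop.
    """
    optimizer.zero_grad()
    # torch.autocast is the device-agnostic form of amp.autocast;
    # with enabled=False the context manager does nothing.
    with torch.autocast(device_type=batch.device.type, enabled=use_amp):
        loss = nn.functional.mse_loss(model(batch), target)
    # Fail fast instead of silently propagating NaN/inf into the weights.
    if not torch.isfinite(loss):
        raise FloatingPointError(f"non-finite loss: {loss.item()}")
    # With a disabled GradScaler these calls reduce to plain backward()/step().
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```

A call would pass `use_amp=False` together with `torch.cuda.amp.GradScaler(enabled=False)` to reproduce the workaround above. Running in full precision typically costs speed and memory, so it is a workaround rather than a diagnosis of where the NaN first appears.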