lliuz / ARFlow

The official PyTorch implementation of the paper "Learning by Analogy: Reliable Supervision from Transformations for Unsupervised Optical Flow Estimation".

loss explodes when w_smooth=0 and w_ternary=0

awaelchli opened this issue

Hi, thanks for sharing the code.
I noticed that if I train without the smoothness loss and without the ternary loss, the losses explode at around epoch 7 and eventually become NaN.
Did you also observe this in your experiments, and do you have any idea what could cause this behaviour?

I have not tried this case, but I can offer an explanation from my point of view:

When training with our pipeline, the smooth loss and photometric loss provide a basic guarantee that the network always performs well on regular samples, while the augmentation loss encourages the network to tackle challenging samples under the guidance of the first forward pass.

If you train without the ternary loss and smooth loss, the network loses that basic guarantee on regular samples, which makes training unstable.
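As a rough illustration (not the repo's actual loss code) of how the terms interact, here is a minimal sketch; `photometric_loss`, `smooth_loss`, `total_loss`, and the weight values are hypothetical stand-ins. With `w_smooth = w_ternary = 0`, the regularizers vanish and nothing keeps the photometric term from drifting toward degenerate flow fields:

```python
import torch

def photometric_loss(img1, img2_warped, mask):
    # Charbonnier penalty on brightness differences, restricted to the
    # (estimated) non-occluded region given by `mask`.
    diff = ((img1 - img2_warped) ** 2 + 1e-6) ** 0.5
    return (diff * mask).sum() / (mask.sum() + 1e-6)

def smooth_loss(flow, img):
    # First-order edge-aware smoothness: flow gradients are down-weighted
    # where the image itself has strong gradients (i.e. at edges).
    flow_dx = (flow[:, :, :, 1:] - flow[:, :, :, :-1]).abs()
    flow_dy = (flow[:, :, 1:, :] - flow[:, :, :-1, :]).abs()
    img_dx = (img[:, :, :, 1:] - img[:, :, :, :-1]).abs().mean(1, keepdim=True)
    img_dy = (img[:, :, 1:, :] - img[:, :, :-1, :]).abs().mean(1, keepdim=True)
    return (flow_dx * torch.exp(-img_dx)).mean() + (flow_dy * torch.exp(-img_dy)).mean()

def total_loss(l_photo, l_ternary, l_smooth, l_augment,
               w_ternary=1.0, w_smooth=50.0, w_augment=0.2):
    # Hypothetical weighting: setting w_smooth and w_ternary to zero
    # removes both regularizing terms from the objective.
    return l_photo + w_ternary * l_ternary + w_smooth * l_smooth + w_augment * l_augment
```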

Besides, you can freeze the model in the first forward pass, so that you can train without the smooth loss and ternary loss in a model-distillation manner.
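A minimal sketch of that distillation variant, assuming a two-frame `model(img1, img2)` interface and pre-sampled `transform_images` / `transform_flow` functions that apply one consistent augmentation to the image pair and the flow (all placeholders, not the repo's actual API):

```python
import torch
import torch.nn.functional as F

def distillation_step(model, img1, img2, transform_images, transform_flow):
    # First pass acts as a frozen teacher: no gradients are recorded,
    # so its prediction serves as a fixed pseudo label.
    with torch.no_grad():
        flow_teacher = model(img1, img2)

    # Apply the *same* pre-sampled transformation to the image pair,
    # and the matching transformation to the teacher flow.
    img1_aug, img2_aug = transform_images(img1, img2)
    flow_pseudo = transform_flow(flow_teacher)

    # Second pass on the transformed pair, supervised directly by the
    # pseudo label; this supervision replaces the smooth/ternary terms.
    flow_student = model(img1_aug, img2_aug)
    return F.l1_loss(flow_student, flow_pseudo)
```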