How does the loss converge?
jyshee opened this issue · comments
jyshee commented
Does the AMS loss have a convergence curve similar to that of the softmax loss? In my experiments, the AMS loss (with m=1) changes very little during training, even after many iterations.
Feng Wang commented
The loss value is not meaningful in AMS. Try logging the accuracy, the target logit, etc. to observe whether the network is converging.
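As a rough illustration of that suggestion, here is a minimal pure-Python sketch that reports accuracy and the mean target cosine alongside the loss for one batch. The function name `ams_batch_stats`, the input layout (one row of per-class cosine similarities per sample), and the defaults `s=30.0` and `m=0.35` are assumptions for this example, not taken from the repository's code. Accuracy is computed from the raw cosines without the margin, which is also an assumption about how one would want to measure it.

```python
import math


def ams_batch_stats(cosines, labels, s=30.0, m=0.35):
    """Return (mean AMS loss, accuracy, mean target cosine) for one batch.

    cosines: list of rows; row[j] is the cosine similarity to class j.
    labels:  list of ground-truth class indices, one per row.
    s, m:    scale and additive margin (assumed defaults for this sketch).
    """
    total_loss = 0.0
    correct = 0
    total_target = 0.0
    for row, y in zip(cosines, labels):
        # Subtract the margin from the target class only, then scale.
        logits = [s * (c - m) if j == y else s * c for j, c in enumerate(row)]
        # Numerically stable log-sum-exp.
        mx = max(logits)
        lse = mx + math.log(sum(math.exp(z - mx) for z in logits))
        total_loss += lse - logits[y]
        # Prediction uses the raw cosines (no margin applied).
        pred = max(range(len(row)), key=row.__getitem__)
        correct += int(pred == y)
        total_target += row[y]
    n = len(labels)
    return total_loss / n, correct / n, total_target / n


if __name__ == "__main__":
    batch = [[0.9, 0.1], [0.2, 0.8]]
    loss, acc, target = ams_batch_stats(batch, [0, 1])
    print(f"loss={loss:.4f} acc={acc:.2f} target_cos={target:.2f}")
```

Logging the second and third values every few hundred iterations gives a curve that can be read directly: the target cosine should drift toward 1 and the accuracy should rise if the network is learning, whatever the loss value itself is doing.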