CoinCheung / pytorch-loss

label-smooth, amsoftmax, partial-fc, focal-loss, triplet-loss, lovasz-softmax. Maybe useful

do buffered params in EMA need to be updated?

DietDietDiet opened this issue · comments

Hi, it seems like the buffered parameters are not affected by the optimizer, i.e., they remain unchanged during training. So I am wondering: do these params need to be updated by the EMA? Thanks!

Hi,

From my observation, there are two ways to deal with buffers: one is to apply EMA to them along with the parameters, and the other is to copy them directly from the model being trained. In my experience there is little difference between the two. There can be some performance gap, but I did not observe a stable trend: sometimes applying EMA to the buffers works better and sometimes direct copying does, and the gap between them on the test set is not big.
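To make the two options concrete, here is a minimal framework-free sketch (plain Python dicts stand in for a model's parameters and buffers; all names here are illustrative, not from this repo's code):

```python
def ema_update(ema_state, model_state, decay=0.999, ema_buffers=True):
    """Update an EMA copy of a model's state.

    ema_state / model_state: dicts with two keys, 'params' (tensors/values
    updated by the optimizer) and 'buffers' (e.g. BatchNorm running stats,
    which the optimizer never touches).
    """
    # Parameters are always smoothed with the exponential moving average.
    for k, v in model_state["params"].items():
        ema_state["params"][k] = decay * ema_state["params"][k] + (1 - decay) * v

    # Buffers: either smooth them the same way, or copy them verbatim from
    # the live model -- the two options discussed above.
    for k, v in model_state["buffers"].items():
        if ema_buffers:
            ema_state["buffers"][k] = decay * ema_state["buffers"][k] + (1 - decay) * v
        else:
            ema_state["buffers"][k] = v
    return ema_state


# Example: one update step with decay=0.9, copying buffers directly.
ema = {"params": {"w": 0.0}, "buffers": {"running_mean": 0.0}}
live = {"params": {"w": 1.0}, "buffers": {"running_mean": 1.0}}
ema_update(ema, live, decay=0.9, ema_buffers=False)
# ema["params"]["w"] is now 0.1; ema["buffers"]["running_mean"] is 1.0
```

With a real `nn.Module` the same logic applies by iterating `model.named_parameters()` and `model.named_buffers()` separately; toggling `ema_buffers` switches between the two strategies compared in this thread.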