CoinCheung / pytorch-loss

label-smooth, amsoftmax, partial-fc, focal-loss, triplet-loss, lovasz-softmax. Maybe useful

do buffered params in EMA need to be updated?

DietDietDiet opened this issue · comments

Hi, it seems like the buffered parameters are not affected by the optimizer, i.e., they remain unchanged during training. So I am wondering: do these params need to be updated by the EMA? Thanks!

Hi,

From my observation, there are two ways to deal with buffers: one is to apply EMA to them along with the parameters, and the other is to copy them directly from the model being trained. In my experience there is little difference between the two. There can be some performance gap, but I did not observe a stable trend: sometimes applying EMA to the buffers works better and sometimes direct copying does, and the gap between them on the test set is not big.
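To make the two options concrete, here is a minimal framework-free sketch (plain Python dicts stand in for a model's parameters and buffers; all names here are illustrative, not from this repo's code):

```python
def ema_update(ema_state, model_state, decay=0.999, ema_buffers=True):
    """Update an EMA copy of a model's state.

    ema_state / model_state: dicts with two keys, 'params' (tensors/values
    updated by the optimizer) and 'buffers' (e.g. BatchNorm running stats,
    which the optimizer never touches).
    """
    # Parameters are always smoothed with the exponential moving average.
    for k, v in model_state["params"].items():
        ema_state["params"][k] = decay * ema_state["params"][k] + (1 - decay) * v

    # Buffers: either smooth them the same way, or copy them verbatim from
    # the live model -- the two options discussed above.
    for k, v in model_state["buffers"].items():
        if ema_buffers:
            ema_state["buffers"][k] = decay * ema_state["buffers"][k] + (1 - decay) * v
        else:
            ema_state["buffers"][k] = v
    return ema_state


# Example: one update step with decay=0.9, copying buffers directly.
ema = {"params": {"w": 0.0}, "buffers": {"running_mean": 0.0}}
live = {"params": {"w": 1.0}, "buffers": {"running_mean": 1.0}}
ema_update(ema, live, decay=0.9, ema_buffers=False)
# ema["params"]["w"] is now 0.1; ema["buffers"]["running_mean"] is 1.0
```

With a real `nn.Module` the same logic applies by iterating `model.named_parameters()` and `model.named_buffers()` separately; toggling `ema_buffers` switches between the two strategies compared in this thread.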