LiJunnan1992 / DivideMix

Code for paper: DivideMix: Learning with Noisy Labels as Semi-supervised Learning


Question about overfitting

MrChenFeng opened this issue

Hi,

Thanks so much for sharing your code and work!
I wonder, have you tried asymmetric noise at a low ratio? I have tried some different noise modes, such as mixing asymmetric and symmetric noise together (sketched below), and sometimes the network seems to overfit quickly in the initial warm-up epochs. Do you have any suggestions for modifying the loss and regularization tricks in this condition? Actually, I'm curious and confused about the relation between the noise mode and the loss distribution. Any suggestions would be highly appreciated!
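
For context, here is roughly the kind of mixed noise injection I mean; a minimal sketch, where the function name and ratios are just illustrative and not from the repo:

```python
import numpy as np

def mix_noise(labels, num_classes, sym_ratio=0.2, asym_ratio=0.1, seed=0):
    """Corrupt labels with symmetric noise (uniform over classes) plus
    asymmetric noise (each class flipped to a class-dependent target)."""
    rng = np.random.default_rng(seed)
    noisy = np.array(labels, copy=True)
    n = len(noisy)
    # Symmetric part: relabel a random subset uniformly at random.
    sym_idx = rng.choice(n, size=int(sym_ratio * n), replace=False)
    noisy[sym_idx] = rng.integers(0, num_classes, size=len(sym_idx))
    # Asymmetric part: flip a disjoint subset to a structured target
    # (here simply the next class; a real mapping would pair similar classes).
    rest = np.setdiff1d(np.arange(n), sym_idx)
    asym_idx = rng.choice(rest, size=int(asym_ratio * n), replace=False)
    noisy[asym_idx] = (noisy[asym_idx] + 1) % num_classes
    return noisy
```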

Best,
Chen

Hi,
Have you tried activating the confidence penalty that is used for asymmetric noise? Asymmetric noise is usually easier to overfit to because the noise has structure.
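
For reference, the penalty is a negative-entropy term added to the cross-entropy loss during warm-up; a minimal sketch:

```python
import torch
import torch.nn.functional as F

def neg_entropy(logits):
    """Confidence penalty: negative entropy of the softmax predictions.
    Adding it to the loss discourages over-confident (memorized) outputs."""
    log_probs = F.log_softmax(logits, dim=1)
    return torch.mean(torch.sum(log_probs.exp() * log_probs, dim=1))

# Warm-up step (sketch): only activate the penalty for structured noise.
# loss = F.cross_entropy(logits, labels)
# if noise_mode == 'asym':
#     loss = loss + neg_entropy(logits)
```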

Hi,
Actually, I added a weight hyperparameter for the confidence-regularization term. It seems the loss distribution moved to the right as the weight got bigger, but it remained single-peaked.
Sadly, it didn't work.
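
Concretely, what I tried was just scaling that term; `penalty_weight` is the hyperparameter I added (not part of the original code), and `neg_entropy` is the penalty sketched above:

```python
# Weighted warm-up loss I experimented with; `penalty_weight` is my
# added hyperparameter, not something from the original DivideMix code.
loss = F.cross_entropy(logits, labels) + penalty_weight * neg_entropy(logits)
```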

Can I ask what kind of noise distribution you use? You may also want to try different numbers of warm-up epochs and see which results in more separation in the loss distribution (one way to check this is sketched below). Moreover, a larger learning rate may also help.
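
One way to check the separation is to fit a two-component GMM to the per-sample warm-up losses, the same way DivideMix models clean vs. noisy samples; a minimal sketch:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def clean_probability(losses):
    """Fit a two-component GMM to per-sample losses and return each sample's
    posterior probability of belonging to the lower-mean (clean) component."""
    losses = np.asarray(losses, dtype=np.float64)
    # Normalize to [0, 1] so the fit is insensitive to the loss scale.
    losses = (losses - losses.min()) / (losses.max() - losses.min() + 1e-8)
    losses = losses.reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, max_iter=10, tol=1e-2, reg_covar=5e-4)
    gmm.fit(losses)
    clean = gmm.means_.argmin()  # component with the smaller mean loss
    return gmm.predict_proba(losses)[:, clean]
```

If the two component means stay close together after warm-up, the loss distribution is effectively single-peaked and the cleaning step has little to work with.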

I would say the noise modes I tried tend to be noisier: one real class may be blended with noisy samples from two or three other classes.

It might be that there is simply too much noise. In my experience, the model needs to be able to learn something during warm-up in order for the noise cleaning to start working.