Different image normalization mean/std in different code paths

Question

vadimkantorov opened this issue 4 years ago

@ajabri I noticed that different code paths use different image normalization parameters.

Training Kinetics400 path: https://github.com/ajabri/videowalk/blob/0834ff9/code/utils/augs.py#L10-L11 :

IMG_MEAN = (0.4914, 0.4822, 0.4465)
IMG_STD  = (0.2023, 0.1994, 0.2010)

Evaluation DAVIS2017 path: https://github.com/ajabri/videowalk/blob/0834ff9/code/data/vos.py#L173:

mean, std = [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]

Both seem to be in RGB channel order. Is that correct?

Why are they different? Does this lead to better accuracy? Thanks!
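To get a feel for how far apart the two paths are, here is a minimal sketch in plain Python that normalizes the same pixel with both sets of constants. The `normalize` helper and the `TRAIN_*`/`EVAL_*` names are hypothetical and not part of the repository; only the numeric values are copied from the two snippets above.

```python
# Constants copied from the two code paths quoted above.
TRAIN_MEAN = (0.4914, 0.4822, 0.4465)  # code/utils/augs.py
TRAIN_STD = (0.2023, 0.1994, 0.2010)
EVAL_MEAN = (0.485, 0.456, 0.406)      # code/data/vos.py
EVAL_STD = (0.229, 0.224, 0.225)


def normalize(pixel, mean, std):
    """Per-channel (x - mean) / std for an RGB pixel with values in [0, 1]."""
    return tuple((x - m) / s for x, m, s in zip(pixel, mean, std))


pixel = (0.5, 0.5, 0.5)  # mid-gray, chosen only for illustration
train_out = normalize(pixel, TRAIN_MEAN, TRAIN_STD)
eval_out = normalize(pixel, EVAL_MEAN, EVAL_STD)

# Absolute per-channel difference between what the two paths would produce.
gaps = tuple(abs(a - b) for a, b in zip(train_out, eval_out))
print("train path:", train_out)
print("eval path: ", eval_out)
print("gap:       ", gaps)
```

For this particular pixel the per-channel gap stays below 0.2 in normalized units, so the same image does reach the network with a small but nonzero shift depending on the path.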

A. Jabri · Answer 1 · Sat Jan 30 2021 06:26:37 GMT+0800 (China Standard Time)

Hi @vadimkantorov,

Thanks for pointing this out! Oops, this was not intentional. The normalization statistics in vos.py are the ImageNet stats. The stats used in training came from an old code base; I think they pertain to Kinetics. This should not significantly affect performance. I should remove normalization stats altogether.

Vadim Kantorov · Answer 2 · Sat Jan 30 2021 06:52:22 GMT+0800 (China Standard Time)

So which should I use for training and my own evaluation? :)

The Kinetics ones?

> I should remove normalization stats altogether.

Does that mean you're suggesting using just the ImageNet ones?

A. Jabri · Answer 3 · Mon Feb 08 2021 03:17:07 GMT+0800 (China Standard Time)

Yes, the normalization statistics are similar, but it was a bug to not have them consistent. The more correct thing to do is use consistent normalization.

What I meant by removing those normalizations is replacing them with non-dataset-specific constants; I don't think it has a strong effect.
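One way to keep the two paths consistent is to define the constants once and have both training and evaluation use them. The sketch below is only an illustration: the thread does not say which non-dataset-specific constants were meant, so the value 0.5 for both mean and std is an assumption, and the `normalize`/`denormalize` helpers are hypothetical names rather than functions from the repository.

```python
# Hypothetical shared constants: both the training and the evaluation code
# would read these instead of keeping their own copies.
# ASSUMPTION: 0.5 is an illustrative choice; the thread names no specific values.
IMG_MEAN = (0.5, 0.5, 0.5)
IMG_STD = (0.5, 0.5, 0.5)


def normalize(pixel, mean=IMG_MEAN, std=IMG_STD):
    """Map an RGB pixel from [0, 1] to [-1, 1], channel by channel."""
    return tuple((x - m) / s for x, m, s in zip(pixel, mean, std))


def denormalize(pixel, mean=IMG_MEAN, std=IMG_STD):
    """Invert normalize, e.g. before visualizing a model input."""
    return tuple(x * s + m for x, m, s in zip(pixel, mean, std))


black, white = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
print(normalize(black), normalize(white))
```

With a single definition, a later change to the constants cannot leave training and evaluation out of sync, which is the bug described in this thread.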