Image normalization and VIT
GhostLate opened this issue
I noticed that the DataLoader applies only one transform: ToTensor().
Why don't you apply image normalization (mean, std) before the first ViT layers?
Hi @GhostLate, we follow OSX in training the transformer backbones. We did not run extensive experiments on the training details, so some tuning here and there may be useful.