asyml / vision-transformer-pytorch

PyTorch version of Vision Transformer (ViT) with pretrained models. This is part of the CASL (https://casl-project.github.io/) and ASYML projects.

https://asyml.io/

Image resolution for training

a-maiti opened this issue 3 years ago

Abhishek Maiti commented 3 years ago

What image resolution was used for training on ImageNet? The paper says 224, but it seems 384 was used in this code?

Abhishek Maiti commented 3 years ago

Never mind, got it. It is correct.
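
For readers comparing the two resolutions mentioned above, here is a minimal sketch of how input resolution changes the number of patch tokens a ViT processes. It assumes square images and a 16x16 patch size; the thread does not say which model variant is meant, so the patch size is an assumption, and `num_patch_tokens` is a hypothetical helper, not a function from this repository.

```python
def num_patch_tokens(image_size: int, patch_size: int = 16) -> int:
    """Return the number of non-overlapping patches in a square image.

    patch_size=16 is an assumed default for illustration only.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side


# Compare the two resolutions discussed in this issue.
for resolution in (224, 384):
    print(resolution, num_patch_tokens(resolution))
```

Under these assumptions, a 224x224 input gives 196 patch tokens and a 384x384 input gives 576, so a model moved between the two resolutions sees a different sequence length.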