lucidrains / vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

About img size in Distillation

Eric-YuanL opened this issue · comments

Hi,

May I ask why the ViT Usage example uses img = torch.randn(1, 3, 256, 256), but the Distillation Usage example uses img = torch.randn(2, 3, 256, 256)?

What difference does changing the 1 to a 2 make?
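For context, the leading dimension of the tensor is just the batch size (batch, channels, height, width), so both examples feed 256x256 RGB images; they simply differ in how many images are passed per forward pass. A minimal sketch, assuming the ViT constructor arguments from the repo's README usage example, showing that either batch size works:

```python
import torch
from vit_pytorch import ViT

# constructor arguments taken from the README's ViT usage example
v = ViT(
    image_size = 256,
    patch_size = 32,
    num_classes = 1000,
    dim = 1024,
    depth = 6,
    heads = 16,
    mlp_dim = 2048,
    dropout = 0.1,
    emb_dropout = 0.1
)

# single image: batch dimension of 1
img_single = torch.randn(1, 3, 256, 256)
preds_single = v(img_single)   # shape (1, 1000)

# two images: batch dimension of 2
img_batch = torch.randn(2, 3, 256, 256)
preds_batch = v(img_batch)     # shape (2, 1000)
```

The distillation example presumably just happens to use a batch of 2 random images; the model itself does not require a particular batch size.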