lucidrains / vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

CCT and non-square images

Gasp34 opened this issue

I think the current implementation of CCT doesn't allow passing non-square images.

If I pass img_size as a tuple, I can get it to work by replacing these two lines

height=img_size,
width=img_size),
with

height=img_size[0],
width=img_size[1]),

Maybe you could modify the code in this repo so that it works when img_size is either an int or a tuple.
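One way to accept both forms might look like the sketch below. The helper name `to_pair` and the variable names are hypothetical, chosen for illustration; this is not necessarily how the repo handles it.

```python
def to_pair(size):
    """Normalize an image size to a (height, width) tuple.

    Hypothetical helper: accepts either a single int (square image)
    or a 2-element tuple/list given as (height, width).
    """
    if isinstance(size, int):
        return size, size
    # Unpacking raises ValueError if size does not have exactly two elements.
    height, width = size
    return int(height), int(width)


# Both call styles yield separate height and width values,
# which could then be passed as height=img_height, width=img_width.
img_height, img_width = to_pair((224, 384))
square_height, square_width = to_pair(224)
```

With something like this, the two lines above would not need to index img_size directly, and existing callers that pass an int would keep working.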

@Gasp34 oh yup, sure 9cd56ff

thx a lot !