lucidrains / vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

I want to use pretrained VITMAE to embed patches without any masks which inturn I plan to feed into a decoder, Can someone point me in the correct direction ?

aishik11 opened this issue · comments