How to get the features and positional embedding information
marcomameli1992 opened this issue · comments
Dear,
I would like to use your package, but not for classification. I need it to extract features from images and return them as output. In addition, I need the positional embedding information to reconstruct the feature images.
Thank you so much.
@marcomameli1992 hi Marco, for the embeddings you just need to modify ViT to return them here: https://github.com/lucidrains/vit-pytorch/blob/main/vit_pytorch/vit.py#L123
i guess i could add this, but i don't want to cloud how simple and clear the code is atm
@marcomameli1992 what do you mean by the positional embedding? the absolute positional embeddings are added to the patch tokens at the beginning, before they are fed through the attention layers, and can be accessed as v.pos_embedding
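A rough sketch of the two suggestions above: returning the embeddings from forward, and reading the positional embeddings off the model. TinyViT is a hypothetical stand-in written for this example, not the library's ViT; its sizes, its use of nn.TransformerEncoder, and the assumption that the CLS token sits at index 0 are illustrative only.

```python
import torch
from torch import nn


class TinyViT(nn.Module):
    """Hypothetical stand-in for a ViT, reduced to the parts discussed above."""

    def __init__(self, image_size=32, patch_size=8, dim=16, num_classes=10):
        super().__init__()
        self.patch_size = patch_size
        self.grid = image_size // patch_size          # patches per side
        num_patches = self.grid ** 2
        patch_dim = 3 * patch_size ** 2               # RGB pixels per patch
        self.to_patch_embedding = nn.Linear(patch_dim, dim)
        self.cls_token = nn.Parameter(torch.randn(1, 1, dim))
        # Learned absolute positions: one vector per patch plus one for the CLS token.
        self.pos_embedding = nn.Parameter(torch.randn(1, num_patches + 1, dim))
        layer = nn.TransformerEncoderLayer(dim, nhead=4, dim_feedforward=32, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.mlp_head = nn.Linear(dim, num_classes)

    def forward(self, img):
        b, p = img.shape[0], self.patch_size
        # Cut the image into non-overlapping p x p patches and flatten each one.
        patches = img.unfold(2, p, p).unfold(3, p, p)  # (b, c, gh, gw, p, p)
        patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(b, self.grid ** 2, -1)
        x = self.to_patch_embedding(patches)
        x = torch.cat((self.cls_token.expand(b, -1, -1), x), dim=1)
        x = x + self.pos_embedding  # positions are added before the attention layers
        embeddings = self.transformer(x)
        logits = self.mlp_head(embeddings[:, 0])
        return logits, embeddings   # the extra return value


model = TinyViT().eval()
with torch.no_grad():
    logits, embeddings = model(torch.randn(1, 3, 32, 32))

# Drop the CLS position and lay the remaining vectors out on the patch grid.
pos_grid = model.pos_embedding[0, 1:].reshape(model.grid, model.grid, -1)
print(logits.shape, embeddings.shape, pos_grid.shape)
```

With a layout like this, embeddings[:, 1:] can be reshaped onto the same grid as pos_grid, which is one way to get a spatial feature map back out of the token sequence.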
@marcomameli1992 actually, let me just write up a layer extractor that can wrap the ViT and return all these intermediates, similar to https://github.com/lucidrains/vit-pytorch/blob/main/vit_pytorch/recorder.py
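For anyone wondering what such a wrapper could look like, here is a minimal sketch built on PyTorch forward hooks. It is not the library's implementation: EmbeddingRecorder and the toy backbone are hypothetical names made up for this example, and a small nn.Sequential stands in for the ViT.

```python
import torch
from torch import nn


class EmbeddingRecorder(nn.Module):
    """Hypothetical wrapper: returns the model output plus one submodule's output."""

    def __init__(self, model, layer):
        super().__init__()
        self.model = model
        self.latest = None
        # The hook fires after every forward pass of `layer`.
        self.handle = layer.register_forward_hook(self._save)

    def _save(self, module, inputs, output):
        # Keep a detached copy so later in-place ops cannot change it.
        self.latest = output.detach().clone()

    def forward(self, x):
        out = self.model(x)
        return out, self.latest


# Toy model standing in for a ViT; index 1 plays the role of the transformer.
backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
recorder = EmbeddingRecorder(backbone, backbone[1])

with torch.no_grad():
    logits, hidden = recorder(torch.randn(2, 8))
print(logits.shape, hidden.shape)
```

The upside of the hook approach is that the wrapped model's own forward stays untouched, so the original code remains as simple as before.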
@marcomameli1992 does this work for you? https://github.com/lucidrains/vit-pytorch/tree/0.25.1#accessing-embeddings