Zhe Gan's starred repositories
GenerativeImage2Text
GIT: A Generative Image-to-text Transformer for Vision and Language
ViT-Adapter
[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
pytorch_violet
A PyTorch implementation of VIOLET
Stable-Pix2Seq
A full-fledged version of Pix2Seq
Focal-Transformer
[NeurIPS 2021 Spotlight] Official code for "Focal Self-attention for Local-Global Interactions in Vision Transformers"