Code release for "VoLTA: Vision-Language Transformer with Weakly-Supervised Local-Feature Alignment" [TMLR, 2023]