Repositories under the textvqa topic:
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
Official code for the paper "Spatially Aware Multimodal Transformers for TextVQA", published at ECCV 2020.
[PRL 2024] Code repository for our label-free pruning and retraining technique for autoregressive Text-VQA Transformers (TAP, TAP†).