multimodal-pre-trained-model

There are 3 repositories under the multimodal-pre-trained-model topic.

clovaai / donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
document-ai eccv-2022 multimodal-pre-trained-model ocr nlp computer-vision
Language: Python · Stars: 5584
jpWang / LiLT
Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)
document-ai document-analysis document-understanding information-extraction multilingual-models multimodal-pre-trained-model nlp
Language: Python · Stars: 333
marslanm / Multimodality-Representation-Learning
A comprehensive collection of research papers on multimodal representation learning, all of which are cited and discussed in the recently accepted survey: https://dl.acm.org/doi/abs/10.1145/3617833
cross-modal multimodal-datasets multimodal-deep-learning multimodal-pre-trained-model transformer-models vision-language-pretraining multimodal-applications multimodal-pretext
Stars: 64
