clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Home Page: https://arxiv.org/abs/2111.15664

VisionEncoderDecoderModel convert

sjtu-cz opened this issue:

How do I convert a trained Donut model into a VisionEncoderDecoderModel?

Smells like an XY problem: what exactly are you trying to do? Loading a Donut model with the Hugging Face VisionEncoderDecoderModel implementation should be straightforward. Just make sure you use the right Donut tokenizer with it. The docs should cover what you are looking for:
https://huggingface.co/docs/transformers/model_doc/donut
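
For reference, here is a minimal sketch of what loading looks like, assuming the checkpoint is already stored in Hugging Face format. The checkpoint name `naver-clova-ix/donut-base` is only an example, and `load_donut` and `tokenizer_fits_decoder` are hypothetical helper names, not part of either library. The second helper is one rough way to check the "right tokenizer" point: the tokenizer should not produce ids that fall outside the decoder's embedding table. This does not convert a checkpoint saved by the original clovaai/donut training code.

```python
def load_donut(checkpoint="naver-clova-ix/donut-base"):
    """Load a Donut checkpoint that is already in Hugging Face format.

    The default checkpoint name is an assumption for illustration;
    pass a local directory or another Hub id as needed.
    """
    # Imported lazily so the check below can be used without transformers.
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    # The processor bundles the image processor and the tokenizer.
    processor = DonutProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
    return processor, model


def tokenizer_fits_decoder(processor, model):
    """Return True if every token id has a row in the decoder embeddings.

    A mismatch usually means the tokenizer (for example, one with extra
    task-specific special tokens) does not belong to this model.
    """
    vocab_size = len(processor.tokenizer)
    embedding_rows = model.decoder.get_input_embeddings().num_embeddings
    return vocab_size <= embedding_rows
```

Typical use would be `processor, model = load_donut()` followed by `tokenizer_fits_decoder(processor, model)`. If the check fails after adding special tokens, `model.decoder.resize_token_embeddings(len(processor.tokenizer))` is the usual remedy.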