Integrate a customized internal OCR engine into Donut

Question

Altimis opened this issue 4 months ago · comments

Yassine Ait Jeddi commented 4 months ago

Hello guys. Thank you so much for this brilliant model.
I'm aware that Donut is an OCR-free model that does not rely on OCR input. When I ran some tests (fine-tuning the model), I found that the internal OCR engine does not perform as well as Google Cloud Vision OCR. Is it possible to replace the OCR engine with this one? Thank you!

Felix · Answer 1 · Sat Feb 03 2024 04:47:51 GMT+0800 (China Standard Time)

Donut is not made to compete with OCR engines. It is pre-trained to generate OCR-style text so that it gains a general understanding of characters and language, which can then be leveraged in fine-tuning tasks such as extracting specific information from an input image. If you want good OCR, I would recommend sticking to Tesseract or a cloud solution like the one you suggested.
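
For reference, here is a minimal sketch of the Tesseract route. It assumes the `pytesseract` wrapper and Pillow are installed and that the Tesseract binary is on your PATH. The function name `ocr_image` and the example file name are hypothetical, chosen only for illustration, and this is not part of Donut.

```python
from pathlib import Path


def ocr_image(image_path: str, lang: str = "eng") -> str:
    """Return the raw text Tesseract reads from a single image file.

    `ocr_image` is a hypothetical helper for illustration; it is not
    part of Donut or of pytesseract.
    """
    path = Path(image_path)
    # Fail early with a clear error before touching the OCR dependencies.
    if not path.is_file():
        raise FileNotFoundError(f"No such image: {path}")

    # Imported lazily so this module still loads on machines where the
    # optional dependencies (pytesseract, Pillow) are not installed.
    import pytesseract
    from PIL import Image

    with Image.open(path) as img:
        # image_to_string runs the Tesseract binary and returns plain text.
        return pytesseract.image_to_string(img, lang=lang).strip()


if __name__ == "__main__":
    # "receipt.png" is a placeholder path; replace it with your own image.
    try:
        print(ocr_image("receipt.png"))
    except FileNotFoundError as err:
        print(err)
```

The text returned this way can be used wherever you need plain OCR output, while Donut stays responsible for the structured extraction it was fine-tuned for.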