charlesw / tesseract

A .Net wrapper for tesseract-ocr

With the latest Tesseract NuGet package (4.1.1), we are facing a performance issue when getting hOCR text from a page.

pjoshi90 opened this issue

We have a 20-page TIFF file. When we run OCR to get the hOCR text for a page, it takes around 4 seconds per page, which noticeably slows down OCR for the multi-page file as a whole.

commented

You could try https://github.com/Sicos1977/TesseractOCR; that one is updated to the latest Tesseract version. I don't know if it makes any difference, though. You will probably need to rewrite some code (not much, I expect) because I changed some things.

Thanks @Sicos19, it works as expected.
