Performance issue when getting hOCR text from a page with the latest Tesseract NuGet package (4.1.1)
pjoshi90 opened this issue
We have a 20-page TIFF file. When we run OCR to get the hOCR text for a page, it takes around 4 seconds per page, which adds up to a significant overall delay when processing a multi-page file.
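To compare wrappers or settings, it helps to measure the per-page cost in isolation. Below is a minimal sketch in Python of such a timing harness. It is not the Tesseract API: `time_pages`, `ocr_page`, and `fake_ocr` are hypothetical names, and `ocr_page` stands in for whatever call actually produces the hOCR text for one page.

```python
import time
from typing import Callable, List, Sequence, Tuple


def time_pages(
    pages: Sequence[object],
    ocr_page: Callable[[object], str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[List[float], float]:
    """Run ocr_page on every page; return (per-page seconds, total seconds)."""
    per_page: List[float] = []
    for page in pages:
        start = clock()
        ocr_page(page)  # hypothetical stand-in for the real per-page hOCR call
        per_page.append(clock() - start)
    return per_page, sum(per_page)


if __name__ == "__main__":
    # Placeholder workload: swap fake_ocr for the real per-page hOCR call.
    def fake_ocr(page: object) -> str:
        return f"<div class='ocr_page'>{page}</div>"

    timings, total = time_pages(range(20), fake_ocr)
    print(f"pages={len(timings)} total={total:.3f}s avg={total / len(timings):.3f}s")
```

At roughly 4 seconds per page, a 20-page file processed sequentially comes to about 80 seconds in total, so per-page timings show whether the cost is uniform or concentrated in a few pages.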
You could try https://github.com/Sicos1977/TesseractOCR that one is updated to the latest Tesseract version. Don't know if it makes any difference though. You probably need to rewrite some code (expect not much) because I changed some things.
Thanks @Sicos19, it works as expected.