Negative (invert image)

Question

DJuego opened this issue 4 years ago

Congratulations on such a promising tool! Thanks for the effort!

Sometimes dark text on a light background seems to work better than light text on a dark background (or vice versa), depending on the specific sample.

It would be useful to have an option (a checkbox or similar) to invert the image clip, i.e. turn it into a negative, before sending it to Tesseract. Is that possible?
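To make the request concrete: for an 8-bit grayscale clip, a negative just replaces each pixel value p with 255 - p. Below is a minimal sketch of that idea, assuming the clip is available as a list of rows of integers in the range 0..255. The `invert_gray` name is hypothetical and is not part of the tool or of Tesseract.

```python
def invert_gray(pixels):
    """Return the negative of an 8-bit grayscale image.

    `pixels` is assumed to be a list of rows, each a list of ints in 0..255.
    A new image is returned; the input is left unchanged.
    """
    return [[255 - p for p in row] for row in pixels]


# Dark text (0) on a light background (255) becomes light text on a dark one.
clip = [
    [255, 255, 255],
    [255, 0, 255],
]
negative = invert_gray(clip)
```

Applying the function twice gives back the original clip, so a checkbox like this would be a cheap, reversible preprocessing step.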

DJuego

Daniel Plakhotich · Answer 1 · Sun Jun 07 2020 04:59:46 GMT+0800 (China Standard Time)

I'd rather avoid adding new options to the interface, unless a feature is 100% useful.

Because of how Tesseract's algorithm works, small changes in an image may lead to dramatically different OCR results. It's so unpredictable that an option to invert the image would be nearly useless in practice. If the recognized text is 50% garbage, inverting the image is unlikely to make a big enough difference to justify toggling a checkbox and running OCR again, and you can't know in advance whether the result will get better or worse.

In my experience, Tesseract is in most cases surprisingly good at guessing the best result, regardless of whether an image has black text on a white background or vice versa. Most of the errors are caused by unusual-looking fonts.

DJuego · Answer 2 · Sun Jun 07 2020 05:09:20 GMT+0800 (China Standard Time)

Thank you for your detailed answer. I understand your arguments. And I have to admit that I find them quite reasonable. I'm satisfied!

DJuego

Daniel Plakhotich · Answer 3 · Mon Jul 25 2022 00:58:43 GMT+0800 (China Standard Time)

FYI. It turns out that Tesseract already does inversion under the hood if recognizing the original image doesn't give a good enough result.

There's even the invert_threshold option to control this behavior:

tesseract-ocr/tesseract@96861b5
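
For anyone who wants to experiment with this outside the tool, here is a minimal sketch that builds a `tesseract` command line setting `invert_threshold` through the generic `-c VAR=VALUE` config flag. It assumes the installed Tesseract build includes the commit above. The `tesseract_command` helper and the `clip.png` file name are hypothetical, and my understanding that a value of 0 turns the inverted retry off should be checked against your Tesseract version.

```python
def tesseract_command(image_path, invert_threshold, lang="eng"):
    """Build (but do not run) a tesseract CLI call with invert_threshold set.

    The result is an argument list suitable for subprocess.run().
    "stdout" as the output base makes tesseract print the text instead of
    writing it to a file.
    """
    return [
        "tesseract",
        image_path,
        "stdout",
        "-l", lang,
        "-c", f"invert_threshold={invert_threshold}",
    ]


# Assumption: 0.0 asks Tesseract not to retry with an inverted image.
cmd = tesseract_command("clip.png", 0.0)
```

Passing `cmd` to `subprocess.run(cmd, capture_output=True, text=True)` and comparing the output for a couple of threshold values is probably the quickest way to see whether the setting matters for a given sample.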