danpla / dpscreenocr

Program to recognize text on screen

Home Page: https://danpla.github.io/dpscreenocr/

Negative (invert image)

DJuego opened this issue

Congratulations on such a promising tool! Thanks for the effort!

It seems that, depending on the specific sample, dark text on a light background sometimes works better than light text on a dark background (or vice versa).

It would be interesting to have an option (a checkbox or similar) to "invert" the image clip (negative) before sending it to Tesseract. Is that possible?
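
To make the request concrete, here is a minimal sketch of what "negative" means for an 8-bit grayscale clip: every pixel value v becomes 255 - v. The function name `invert_grayscale` and the list-of-rows image representation are only assumptions for illustration; this is not dpscreenocr's code.

```python
def invert_grayscale(pixels):
    """Return the negative of an 8-bit grayscale image.

    `pixels` is a list of rows, each row a list of ints in 0..255
    (a hypothetical representation chosen for this sketch).
    """
    return [[255 - value for value in row] for row in pixels]


# Light "text" (255) on a dark background (0).
clip = [
    [0, 0, 0, 0],
    [0, 255, 255, 0],
    [0, 0, 0, 0],
]

# After inversion: dark "text" (0) on a light background (255).
negative = invert_grayscale(clip)
```

Inverting twice gives back the original clip, so such an option would lose no information; it would only change which polarity the OCR engine sees.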

DJuego

I'd rather avoid adding new options to the interface, unless a feature is 100% useful.

Because of how Tesseract's algorithm works, small changes in an image may lead to dramatically different OCR results. It's so unpredictable that an option to invert the image would be nearly useless in practice. If the recognized text is 50% garbage, inverting the image is unlikely to make a big enough difference to justify toggling a checkbox and running OCR again, and you can't know in advance whether the result will improve or degrade.

In my experience, in most cases Tesseract is surprisingly good at guessing the best result, regardless of whether an image has black text on a white background or vice versa, and most errors are caused by unusual-looking fonts.

Thank you for your detailed answer. I understand your arguments. And I have to admit that I find them quite reasonable. I'm satisfied!

DJuego

FYI: it turns out that Tesseract already does inversion under the hood if recognizing the original image doesn't give a good enough result.

There's even the invert_threshold option to control this behavior:

tesseract-ocr/tesseract@96861b5
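
For anyone who wants to experiment with this outside dpscreenocr, the sketch below builds a command line that passes `invert_threshold` to the `tesseract` CLI through its `-c VAR=VALUE` flag. The helper name `build_tesseract_command`, the file name `clip.png`, and the value `0.0` are assumptions for illustration; check the Tesseract documentation for the option's default and exact semantics in your version. The code only constructs and prints the command; it does not run Tesseract.

```python
import shlex


def build_tesseract_command(image_path, output_base, invert_threshold):
    """Build a tesseract CLI invocation that sets invert_threshold.

    Hypothetical helper for this sketch: it only assembles the
    argument list and never executes anything.
    """
    return [
        "tesseract",
        str(image_path),
        str(output_base),
        "-c",
        f"invert_threshold={invert_threshold}",
    ]


# "stdout" as the output base asks tesseract to print the text
# instead of writing it to a file.
cmd = build_tesseract_command("clip.png", "stdout", 0.0)
print(shlex.join(cmd))
```

Running the same clip with two different threshold values and comparing the output is one way to see whether the built-in inversion matters for a particular sample.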