OCRs some numbers as text strings
allhavebrainimplantsandmore opened this issue
Current Behavior
I know there are settings to tweak and things you can play with, but it's strange that only a tiny fraction of numbers get recognized as text strings by default. Is there any way to report and collect what Tesseract's default recognition misses, so the engine can be updated and improved? I'm seeing some patterns in what Tesseract misses, and maybe these could be used to make it work better out of the box.
Expected Behavior
No response
Suggested Fix
A way to report what Tesseract's default behavior misses, so the recognition engine can be updated for everyone's benefit.
tesseract -v
tesseract 5.3.2
Operating System
No response
Other Operating System
Fedora 39
uname -a
No response
Compiler
No response
CPU
No response
Virtualization / Containers
No response
Other Information
No response
Closing as the reporter does not provide anything that we can reproduce.
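For anyone who wants to turn observed patterns into something reproducible, here is a minimal sketch of one way to collect candidate misses. It assumes the `tesseract` CLI is on the PATH and uses its TSV output (`tesseract <image> stdout tsv`). The function names and the threshold of 60 are illustrative choices, not part of Tesseract, and low confidence is only a hint that a word may be wrong, not proof of a miss.

```python
import csv
import io
import subprocess


def run_tesseract_tsv(image_path, lang="eng"):
    """Run the tesseract CLI on one image and return its TSV output as text.

    Assumes the `tesseract` executable is installed and on the PATH.
    """
    result = subprocess.run(
        ["tesseract", image_path, "stdout", "-l", lang, "tsv"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def low_confidence_words(tsv_text, threshold=60.0):
    """Return (text, conf) pairs for words whose confidence is below threshold.

    The threshold is an arbitrary, illustrative value.
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    flagged = []
    for row in reader:
        text = (row.get("text") or "").strip()
        try:
            conf = float(row["conf"])
        except (KeyError, TypeError, ValueError):
            continue  # malformed or incomplete row
        if conf < 0 or not text:
            continue  # conf of -1 marks page/block/line rows, not words
        if conf < threshold:
            flagged.append((text, conf))
    return flagged
```

A report that attaches the input image, the exact command line, the actual output, and the expected text gives maintainers something they can rerun; a list like the one returned by `low_confidence_words(run_tesseract_tsv("sample.png"))` (with `sample.png` as a placeholder file name) can help pick which images to attach.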