orc is a tool for parsing structured information from (messy) OCR outputs. This toolkit doesn't use fancy deep learning models. It focuses on simple and efficient algorithms that are practical enough to be used in battle.
This module focuses on approximate string matching. Besides calculating distances between words, it also records the operations that were performed to transform one word into another.
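To make the idea of recording operations concrete, here is a minimal sketch in plain Python. It is not this package's API: the function name `edit_operations` and the tuple layout of each operation are assumptions made for illustration. It computes a standard Levenshtein distance with dynamic programming, then walks back through the table to recover one possible sequence of edits.

```python
def edit_operations(source, target):
    """Return (distance, operations) turning `source` into `target`.

    Hypothetical helper for illustration only, not part of the toolkit.
    Each operation is a tuple (kind, index_in_source, old_char, new_char).
    """
    m, n = len(source), len(target)

    # dist[i][j] holds the edit distance between source[:i] and target[:j].
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i  # delete every character of source[:i]
    for j in range(1, n + 1):
        dist[0][j] = j  # insert every character of target[:j]

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # match or substitution
            )

    # Walk back from the bottom-right corner to recover the operations.
    ops = []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and source[i - 1] == target[j - 1] \
                and dist[i][j] == dist[i - 1][j - 1]:
            i, j = i - 1, j - 1  # characters match, nothing to record
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("substitute", i - 1, source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", i - 1, source[i - 1], None))
            i -= 1
        else:
            ops.append(("insert", i, None, target[j - 1]))
            j -= 1

    ops.reverse()  # operations were collected from right to left
    return dist[m][n], ops


if __name__ == "__main__":
    # A typical OCR confusion: the digit 0 read in place of the letter O.
    print(edit_operations("0CR", "OCR"))
    # (1, [('substitute', 0, '0', 'O')])
```

Keeping the operations alongside the distance is what makes it possible to see which characters an OCR engine tends to confuse, rather than only how far apart two words are.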
git clone https://github.com/MaxHalford/orc
cd orc
pip install poetry
poetry install
poetry shell
pytest
The MIT License (MIT). Please see the license file for more information.