All-in-one package for document (image, PDF) classification. Unified interface for Google OCR and Tesseract. Train, evaluate, and run inference using fastText, small language models (NER), small vision-language models (LayoutLM), and LLMs.
https://pypi.org/project/document-classification/
GitHub repository: https://github.com/amit-timalsina/document_classification