pdf-ocr-extraction

There are 8 repositories under the pdf-ocr-extraction topic.

skylander86 / lambda-text-extractor
AWS Lambda functions to extract text from various binary formats.
text-extraction aws-lambda searchable-pdfs ocr lambda-functions pdf pdf-ocr-extraction tesseract
Language:Python 173
Clearedge-AI / clearedge
Build a RAG preprocessing pipeline
document-parser haystack langchain llamaindex llm ocr pdf pdf-ocr-extraction pdf-to-json pdf-to-text rag-pipeline retrieval-augmented-generation table-detection table-recognition
Language:Jupyter Notebook 10
omaxel / pdf-ocr
Recognize page content of a PDF as text using Tesseract and Ghostscript.
csharp ghostscript ocr pdf pdf-ocr-extraction tesseract-ocr
Language:C# 7
Achiwilms / OCR-Wizard
A powerful and user-friendly tool based on OCRmyPDF, offering a seamless GUI for conversion of image-based PDFs into searchable text.
ocr-pdf ocr-python ocr-recognition ocrmypdf pdf pdf-ocr pdf-ocr-extraction python searchable-pdf
Language:Python 4
fsdesa / pdf-ocr-service
PDF OCR service in Docker.
afip docker factura-afip java ocr pdf pdf-ocr-extraction
Language:Java 1
lakshay1296 / OCR_Django_App_Beta
Example Django-Python project containing modules for OCR, PDF to OCR PDF conversion, text similarity/dissimilarity, and PDF to PNG conversion.
django-application django-project html-css-javascript imagemagick ocr-python ocr-recognition pdf-ocr-extraction python27
Language:Python 1
Firefox-1998 / UtilityPDF
A utility that collects in one place some operations that are commonly performed on PDF files.
compress convert csharp docx merge ocr pdf pdf-compression pdf-converter pdf-merge pdf-ocr-extraction rtf utility
Language:C#
mcagriaksoy / diff_merge_pdf
A tool to compare and merge PDFs, display the differences between them, and run OCR on them.
diff-tool diff-tool-pdf ocr-recognition ocr-text-reader pdf-comparison pdf-document-processor pdf-generator pdf-merger pdf-ocr pdf-ocr-extraction pdf-viewer pdf-visual-testing pymupdf-fitz pyqt6-desktop-application x-ray-images
Language:Python