There are 5 repositories under the scanned-documents topic.
Open Source Document Management System for Digital Archives (Scanned Documents)
ScanTailor Advanced merges the features of the ScanTailor Featured and ScanTailor Enhanced versions and adds new features and fixes.
Evaluate OMR sheets fast and accurately using a scanner 🖨 or your phone 🤳.
Dedoc is a library (service) for automating document parsing and converting documents to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser
A super lightweight image processing algorithm for detection and extraction of overlapped handwritten signatures on scanned documents using OpenCV and scikit-image.
A curated list of awesome projects to simplify and improve paper and document scanning.
Papermerge DMS core backend, REST API server, and frontend UI
The first-ever paper on the Unix shell, written by Ken Thompson in 1976, scanned, transcribed, and redistributed with permission
BoxDetect is a Python package based on OpenCV which allows you to easily detect rectangular shapes like character or checkbox boxes on scanned forms.
Make your PDFs look like they were scanned
Categorize your digital documents in a well designed UI, using modern technologies.
A document scanner that automatically trims the edges using a perspective transform
Android Scanner with OCR support using PDFTron
A tool for extracting text from scanned documents (via OCR), with user-defined post-processing.
Small utility to prepare scanned documents. Supports separating PDF files by separator pages and removing blank pages.
ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction
Documentation for Papermerge DMS - Installation, Help, User Manual, REST API
Segmentation of Scanned Text up to Character Level
Efficient Text Localization Algorithm, Image Inversion Detection of Scanned Documents & Language Identification based on Shape Context and Traditional Computer Vision.
The project creates the models and service API for predicting the rotation angle of scanned document images, ranging from -90° to 90° from the vertical.
Implementation of scanned document table segmentation with U-net
This is a Flask-based project to convert images, scanned documents, or multi-page PDFs into searchable PDFs
Papermerge DMS command line utility
This batch script creates a searchable PDF from a PDF with one or more scanned pages that contain images.
Text search using OCR, plus detection and recognition of tables in scanned documents.
Converts scanned documents and ordinary documents into speech mp3 using Amazon Polly
A program to automate simple and repetitive tasks while scanning documents by Dallin Stewart
The web UI for Facile Search. Together with DocIndex, this UI can help you search the myriad of scanned documents you have been accumulating over the years. Using the power of Docker & Elasticsearch you can run a powerful search engine that lets you convert scanned (image-based) PDFs to searchable text, group documents by letterhead, run fuzzy searches by date and view document metadata.
Auto-correct contrast and brightness of photographed documents
Optical Character Recognition for Scanned Documents
This repository contains automation solutions that efficiently extract text from scanned PDF documents with consistent layouts. Using the Tesseract OCR engine, the UiPath RPA robot achieves nearly 90% accuracy, streamlining the process and significantly reducing manual workload.
🧠 AI-powered pipeline for cleaning scanned documents. Removes noise, enhances text, auto-tunes model weights, and returns OCR-optimized PDFs via CLI or cloud API.