There are 249 repositories under ocr topic.
Tesseract Open Source OCR Engine (main repository)
Awesome multilingual OCR and Document Parsing toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
A high-quality tool for convert PDF to Markdown and JSON.一站式开源高质量数据提取工具,将PDF转换成Markdown和JSON格式。
OCR software, free and offline. 开源、免费的离线OCR软件。支持截屏/批量导入图片,PDF文档识别,排除水印/页眉页脚,扫描/生成二维码。内置多国语言库。
Pure Javascript OCR for more than 100 Languages 📖🎉🖥
A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang.
ShareX is a free and open-source application that enables users to capture or record any area of their screen with a single keystroke. It also supports uploading images, text, and various file types to a wide range of destinations.
A community-supported supercharged document management system: scan, index and archive all your documents
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
🌈一个跨平台的划词翻译和OCR软件 | A cross-platform software for text translation and recognition.
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Convert documents to structured data effortlessly. Unstructured is open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise grade Platform product for production grade workflows, partitioning, enrichments, chunking and embedding.
超轻量级中文ocr,支持竖排文字识别, 支持ncnn、mnn、tnn推理 ( dbnet(1.8M) + crnn(2.5M) + anglenet(378KB)) 总模型仅4.7M
一个简洁优雅的词典翻译 macOS App。开箱即用,支持离线 OCR 识别,支持有道词典,🍎 苹果系统词典,🍎 苹果系统翻译,OpenAI,Gemini,DeepL,Google,Bing,腾讯,百度,阿里,小牛,彩云和火山翻译。A concise and elegant Dictionary and Translator macOS App for looking up words and translating text.
BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, Unified model management, Evaluation, SFT, Dataset Management, Enterprise-level System Management, Observability and more.
Translate manga/image 一键翻译各类图片内文字 https://cotrans.touhou.ai/ (no longer working)
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Scan, index, and archive all of your paper documents
Trained models with fast variant of the "best" LSTM models + legacy models
视频硬字幕提取,生成srt文件。无需申请第三方API,本地实现文本识别。基于深度学习的视频字幕提取框架,包含字幕区域检测、字幕内容提取。A GUI tool for extracting hard-coded subtitle (hardsub) from videos and generating srt files.
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
yolo3+ocr
截屏 离线OCR 搜索翻译 以图搜图 贴图 录屏 万向滚动截屏 屏幕翻译 Screenshot Offline OCR Search Translate Search for picture Paste the picture on the screen Screen recorder Omnidirectional scrolling screenshot Screen translator 支持Windows Linux macOS
All-in-One Development Tool based on PaddlePaddle
A Unified Toolkit for Deep Learning Based Document Image Analysis
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
A supercharged version of paperless: scan, index and archive all your physical documents