There are 17 repositories under the image-to-text topic.
A wrapper to work with Tesseract OCR inside PHP.
Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in PyTorch
A Node.js wrapper for the Tesseract OCR API
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.
Data release for the ImageInWords (IIW) paper.
The module extracts text from images using the tesseract-OCR engine. Text in images is often blurry or unevenly sized, so the image is pre-processed to make it easier for the OCR engine to read. The module first draws a bounding box around the text in the image and then normalizes it to 300 dpi, a resolution suitable for the OCR engine.
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
L-Verse: Bidirectional Generation Between Image and Text
Codebase for fine-tuning / evaluating nougat-based image2latex generation models
A Flutter package for fast, accurate, and secure credit card and debit card scanning
OCR functionality in a feature-rich note-taking extension.
Extracts details from Indian national identification cards, such as PAN (completed) and Aadhar, passport, and driving license (WIP), in a structured format
OCR library to extract text & tables from PDF files and images. Convert any image or PDF to CSV / TXT / JSON / Searchable PDF.
OCR with Google's AI technology (Cloud Vision API)
The largest multilingual image-text classification dataset. It contains fashion products.
Usage is simple: supply a downloaded picture file or a link to one when running the Python script. The output is a text file, and you can immediately preview the result of the conversion on the command line.
Tesseract.js OCR
Telegram bot to convert image to text using Python
DangoOCR: screenshot OCR text recognition, supporting multiple languages, translation after recognition, and audio playback
Notepad is a multi-module Jetpack Compose note-taking app with a sketch pad, voice recorder, and image capture
Inverse DALL-E for Optical Character Recognition
A little Python application to auto tag your photos with the power of machine learning.
Demonstrates Voice Recognition, Text to Speech, Language Translation, OAuth2, Image Generation, Face Detection and Voice Chatbot. Source code and Documentation for my 2023 ADUG Symposium Talk.
Stable Diffusion with Text-to-Image and Image-to-Text
Text-to-image generation and image captioning app for Android, iOS, desktop, and web, built with Compose Multiplatform and Clean Architecture
Image-to-text translator using the OpenAI API & Tesseract
Implementation of Fast ml-CCA from the ICCV-2015 work "Multi-Label Cross-Modal Retrieval"
This system takes an image (such as a government document) as input and converts the image data into a string