Repositories under the pdftotext topic.
Image to Text Tutorial in C# - See https://ironsoftware.com/csharp/ocr/tutorials/how-to-read-text-from-an-image-in-csharp-net/
Fast and memory-efficient Python PDF Parser based on xpdf sources
Batch-convert PDFs to text and extract data from PDFs in Python
Converts a whole subdirectory with a big (or small) volume of PDF documents to a dataset (pandas DataFrame) with error tracking and choice of features
A Python asyncio wrapper for Tesseract-OCR.
Deprecated - A fast API service for retrieving day-to-day stats about the Coronavirus (COVID-19, SARS-CoV-2) outbreak in Kerala (India).
A mirror of https://git.tecosaur.net/tec/pdftotext.el
A simple pdftotext conversion tool for Windows 8.1/10/11 and Fedora/Ubuntu/Debian/Arch-based Linux distros using poppler-utils and Google's tesseract-ocr.
An offline PDF-to-text converter for the Scriptable app (iOS)
Convert documents (pdf, djvu, epub, word) to txt
Python library and web service based on the Poppler pdftotext utility and Tesseract OCR for extracting text from PDF documents
Extract text from a PDF (PDF to text). API for PHP, JS, Python, and others.
This project converts books from PDF to proper JSON objects by separating title and content. You can then easily insert the resulting JSON file into a database.
Convert a scanned PDF into a text-embedded PDF.
A simple RESTful API service for Poppler
Converts an image to a CSV. This exists because Chorus 3.0 is bat-shit and only shows images for vital metadata.
All scrapers for covid19
Computer application built in Python to open, edit, and convert a PDF document to Microsoft Word format. The GUI is designed using Tkinter. Opening, converting, and reading PDF files is carried out by a Python library called PyPDF2.
Python Audio Book is a script that converts PDF text into speech
Python library for reading CIPRS PDFs
My CS50 course project: a PDF parser that processes the grades of applicants admitted through Acesso Enem and organizes everything. Now in C++
"PDF To Audio" is a Python tool that transforms PDF documents into audio files using OCR and Text-to-Speech technology. Ideal for accessibility and auditory learning, it supports multiple languages, parallel processing, and smart rate limit handling.
NLP PDF mining: extracting text from PDFs
Get grade statistics from the USC
Newspaper mining and analysis of the results using Python, with text cleaning using OCR.
A containerised tool to extract text from PDF files using Tesseract OCR
Image to text with a Flask application
The Professor (Converter from PDF to Sound)
Heroku buildpack for the Poppler pdftotext utility
A simple WordPress PDF document manager.
An attempt to make OCR
This Python script utilizes the PyPDF2 library to convert PDF documents into plain text.