Repositories under the pypdf2 topic:
Benchmarking PDF libraries
Text extraction from multiple and large PDF documents.
A Python library/command-line tool to extract the DOI or other identifiers of a scientific paper from a PDF file.
Remove PDF watermarks from academic papers using pypdf
A cross-platform utility, written in Python, to join, split, stamp, and rotate PDFs. Yes, Python!
pdfgui_tools is a user interface tool developed in Qt and Python that integrates with poppler-utils and PyPDF2 for PDF document management. It's a simple and user-friendly tool that includes various utilities.
This is a complete website in which you can chat with a PDF and extract metadata, text, links, images, and a lot more. Check my blog for more details: https://medium.com/@amit.2503719/allaboutpdf-tool-for-data-extraction-and-talking-to-pdf-using-chatpdf-feature-f2daea15a59c
Batch-convert PDFs to text and extract data from PDFs in Python
Prepare documents for distribution
🚀 Parse PDF, Word, and Excel documents. Read, create, merge/combine, and extract data from office documents.
This Flask app removes the CamScanner watermark from scanned PDFs.
Simple PDF-to-text conversion in Python using PDFtk and PyPDF2
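A PDF file processing tool, including: PDF cutting, PDF rotation, PDF merging, PDF splitting, adding page numbers to PDFs, PDF-to-image conversion, Word-to-PDF conversion, and more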
A simple and offline PDF audio reader
A desktop application that helps students choose Disciplinary and Open Electives wisely.
Simple Python GUI Tool for Tesseract4
Get local e-papers (Dainik Jagron and Hindustan)
A script to convert MS Office PPT/PPTX files to PDF files and then merge all the PDF files into a single PDF file.
Smart ATS evaluates resumes against job descriptions, providing match percentage, missing keywords, and improvement suggestions.
A Python script that converts the KCT (Kumaraguru College of Technology) academic calendar PDF file into a CSV file and syncs the events with Google Calendar.
You can convert a PDF to an MP3 file using this Python code
A script that generates a PDF file. You can create a new PDF file from an HTML file, or you can write on top of an existing PDF
Simple Python utilities to play around with PDF files
This repository consists of some beginner-level Python projects.
The "MCQ Generator with Streamlit" web app utilizes OpenAI's language models to create multiple-choice questions (MCQs) from uploaded PDF or text files. Users can customize question parameters like quantity, subject, and tone. The app offers real-time complexity feedback and presents MCQs in an easy-to-read tabular format.
RAG chatbot using Llama 2, chainlit and Faiss
Code used in my Medium story: https://medium.com/@umerfarooq_26378/python-for-pdf-ef0fac2808b0
This repository contains a Python script that extracts the cover photo from a PDF file and saves it as a PNG image. It uses the pdf2image and PyPDF2 packages and can process multiple PDF files at once.
NLP model for extracting Chinese data from documents
This project facilitates the extraction of text from PDF files using various Python libraries. It is designed to be flexible, allowing a choice among different text extraction libraries and supporting both a single PDF file and a directory containing multiple PDF files.
This project is a generative AI chatbot that specializes in extracting and comprehending information from PDF documents. It allows users to upload multiple PDF files, trains on the content of those documents, and enables them to ask questions or make queries related to the PDFs' content.