There are 5 repositories under the pdf-documents topic.
A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
A curated list of resources on the topic of Document Understanding (DU)
Python bindings to PDFium, reasonably cross-platform.
Analyze PDFs. With colors. And Yara.
Up-to-date Laravel documentation in PDF format (all versions)
Malicious PDF files have recently been considered one of the most dangerous threats to system security. The flexible, code-bearing nature of the PDF format enables an attacker to run malicious code on a computer system in order to exploit its users.
A painless HTML to PDF rendering service. Generate PDF reports and documents from HTML templates or raw HTML.
A plugin for Flutter that allows you to read the text content of PDF documents and convert it into strings.
A simple pdftotext conversion tool for Windows 8.1/10/11 and Fedora/Ubuntu/Debian/Arch-based Linux distros, using poppler-utils and Google's tesseract-ocr.
:lock_with_ink_pen: Sign PDFs with the Portuguese citizen card (aka "cartão de cidadão")
Simple frontend for OCRmyPDF (Windows only).
Lightweight helper classes based on iTextSharp for scaling and resizing PDF documents and pages.
CLI program for searching inside text and tables in PDF documents and displaying results in HTML.
Use a PdfGraphics object to add interactive form fields to a PDF document.
Use the Azure Key Vault API to sign a PDF document.
Extract the text of a PDF document and count the words' occurrences in a document text.
Use the PdfDocumentProcessor to add a visual signature to a document.
Access and modify custom document properties.
Implement a custom signer based on the Bouncy Castle C# API and use a custom digest calculator to calculate a document hash.
Create a custom timestamp client based on the Bouncy Castle C# API.
Customize print output and specify settings for a specific document page.
Export a PDF document to multi-page TIFF and bitmap images.
Extract the first page from a PDF document into a separate document.
Use the PDF Document API to create a document with graphics in code.
Obtain a checked appearance name for a check box and specify the check box value.
Use the PDF Document API to apply multiple PKCS#7 signatures with X.509 certificates.
Use the PDF Document API to rotate document pages and save the result.
Use the PDF Document API to embed XMP metadata in PDF documents.
This project offers several versions of the PDFBox source code that can be compiled with Eclipse. The complete version is an unmodified PDFBox that includes all the packages not normally shipped with the PDFBox source code. The other versions are modified and offer more capabilities.
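One entry above describes extracting the text of a PDF document and counting word occurrences. The counting half of that task can be sketched in plain Python, assuming the text has already been extracted by one of the listed libraries. The function name `count_word_occurrences`, the sample string, and the word-splitting rule are illustrative assumptions, not taken from any of the listed repositories.

```python
import re
from collections import Counter


def count_word_occurrences(text: str) -> Counter:
    """Count case-insensitive word occurrences in already-extracted text."""
    # Assumption: a "word" is a run of ASCII letters, digits, or apostrophes.
    # Real PDF text may need Unicode-aware tokenization instead.
    words = re.findall(r"[A-Za-z0-9']+", text.lower())
    return Counter(words)


# Hypothetical sample standing in for text extracted from a PDF page.
sample_text = "PDF files are documents. PDF documents contain pages."
counts = count_word_occurrences(sample_text)

# Show the two most frequent words with their counts.
print(counts.most_common(2))
```

Keeping extraction and counting separate means the same function works regardless of which PDF library produced the text.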