There are 27 repositories under the pdf-converter topic.
#1 Locally hosted web application that allows you to perform various operations on PDF files
A high-quality, one-stop open-source data extraction tool for converting PDF to Markdown and JSON.
Get your documents ready for gen AI
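PDF补丁丁 (PDFPatcher): a PDF toolbox that can edit bookmarks, crop and rotate pages, remove restrictions, extract or merge documents, inspect document structure, extract images, convert pages to images, and more.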
A developer-friendly API for converting numerous document formats into PDF files, and more!
This repo isn't maintained anymore as PhantomJS was deprecated a long time ago. Please migrate to headless Chrome/Puppeteer.
borb is a library for reading, creating and manipulating PDF files in Python.
Open source Python library for converting PDF to DOCX.
Converts binary PDF to JSON and text, for server-side PDF processing and command-line use.
A high-quality PDF to Markdown tool based on large language model visual recognition.
Open-source platform for extracting structured data from documents using AI.
A self-hosted, drag-and-drop & nosql file conversion server & share tool that supports 445 file formats in 13 languages.
An app to convert images to a PDF file!
C# .NET Core wrapper for wkhtmltopdf library that uses Webkit engine to convert HTML pages to PDF.
Booktype is a free, open source platform that produces beautiful, engaging books formatted for print, Amazon, iBooks and almost any ereader within minutes.
Easily deployable 🚀 API to convert PDF to markdown quickly with high accuracy.
📚 Process PDFs, Word documents and more with spaCy
Markdown to PDF command line app with support for stylesheets
🚜 Parse text and tables from PDF files.
Easily deployable and scalable backend server that efficiently converts various document formats (pdf, docx, pptx, html, images, etc.) into Markdown. With support for both CPU and GPU processing, it is ideal for large-scale workflows and offers text/table extraction, OCR, and batch processing with sync/async endpoints.
Docker-powered conversion of HTML to PDF (html2pdf) and HTML to images such as JPEG and PNG (html2image), using a Chrome kernel (Golang).
Simple yet powerful automation stuff.
Run LibreOffice in AWS Lambda to create PDFs & convert documents
Convenient HTML to PDF/A rendering library for Elixir based on Chrome & Ghostscript
pdfCropMargins -- a program to crop the margins of PDF files
Browse a PDF document like a book, turning its pages
Extract annotations (highlights and scribbles) from PDF, EPUB, and notebooks marked with reMarkable tablets. Export to Markdown, PDF, PNG, SVG
Golang HTML to PDF Converter
.NET Core library to create custom reports based on Word docx or HTML documents and convert to PDF