There are 2 repositories under the docling topic.
A Python package that converts PDFs to Markdown while extracting images and tables, and generates text descriptions for the extracted tables and images using several LLM clients, among other features. Markdrop is available on PyPI.
A collection of PDF parsing libraries, such as the AI-based docling, claude, openai, gemini, Meta's llama-vision, and unstructured-io, as well as pdfminer, pymupdf, pdfplumber, etc., for efficient snapshot, text, table, and metadata extraction.
Docling with Ollama - RAG on Local Files with Local Models
Chat to your Database GenAI Chatbot
A Python library and CLI tool to convert PDF files to CSV files.
Docling4j brings the functionalities of Docling in document understanding to Java® projects
Autonomous agent networks for task automation that requires multi-step reasoning
🧙‍♂️ AI-powered tool to optimize your CV with job-specific keywords and align it to your dream job.
This repo contains code for RAG using Docling in a Colab notebook with LangChain, Milvus, a Hugging Face embedding model, and an LLM
Parse documents using AI - any document converted to markdown suitable for RAG applications
DocChat is an AI-powered Multi-Agent RAG system using Docling for structured document parsing and BM25 + vector search retrievers to retrieve fact-checked answers from PDFs, DOCX, and text files, preventing hallucinations. 🚀
A Python script that converts PDF files to text using the docling library. This tool is designed to batch process PDF files, making it easy to extract text content from multiple documents at once.
This repo provides RAG using Docling, LangChain, Milvus, Sentence Transformers, and Hugging Face LLMs
Repository for testing and demonstrating the capabilities of Docling for document conversion.
Agentic RAG-based system with nursing handbooks and transes as knowledge base for my bebiloves
📄 A project template for creating a Chainlit application, using a locally run model via Ollama and a Qdrant vector database for document retrieval.
This repo contains a Google Colab notebook for handling Docling for data extraction, such as text, images, tables, etc.
This project is an AI-powered Contract Risk Assessment and Legal Assistant designed to analyze legal documents, extract key clauses, assess risks, and provide actionable recommendations. Additionally, a fine-tuned conversational chatbot is integrated for interactive legal Q&A based on contract-specific knowledge.
Retrieval-Augmented Generation server with Pinecone and OpenAI