There are 13 repositories under the document-intelligence topic.
Easy-to-use and powerful LLM and SLM library with an awesome model zoo.
Document intelligence framework for Python - Extract text, metadata, and structured data from PDFs, images, Office documents, and more. Built on Pandoc, PDFium, and Tesseract.
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
ContextGem: Effortless LLM extraction from documents
A curated list of resources for Document Understanding (DU) topic
ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.
AI-in-a-Box leverages the expertise of Microsoft across the globe to develop and provide AI and ML solutions to the technical community. Our intent is to present a curated collection of solution accelerators that can help engineers establish their AI/ML environments and solutions rapidly and with minimal friction.
A collection of samples demonstrating techniques for processing documents with Azure AI including AI Foundry, OpenAI, Document Intelligence, etc.
ReadingBank: A Benchmark Dataset for Reading Order Detection
The Doc Intelligence in-a-Box project leverages Azure AI Document Intelligence to extract data from PDF forms and store the data in an Azure Cosmos DB database. This solution, part of the AI-in-a-Box framework by Microsoft Customer Engineers and Architects, ensures quality, efficiency, and rapid deployment of AI and ML solutions across various industries.
This sample demonstrates how to use Document Intelligence's Layout model to convert a PDF document, such as an invoice, into Markdown, then use GPT-3.5 Turbo to extract structured JSON data using the Azure OpenAI Service.
A curated list of resources on Table Structure Recognition
BoundaryNet - A Semi-Automatic Layout Annotation Tool
AI-powered document intelligence platform for automated analysis, processing, and insights extraction from various document formats.
An explainable AI system that combines Graph Intelligence, Vector Search, and Retrieval-Augmented Generation (RAG) to deliver grounded answers and transparent reasoning paths. Includes a FastAPI backend, Streamlit UI, FAISS vector index, and an in-memory knowledge graph for hybrid retrieval and recommendations.
An experiment to provide the template training capabilities of Azure AI Document Intelligence Studio for a feedback loop.
A curated list of resources on Document Layout Analysis
Using Azure Document Intelligence and Azure OpenAI services to automatically extract data from invoices.
Comprehensive learning hub for Azure AI services - 130+ labs and tutorials covering AI-102 certification
StackRAG is a multi-tenant Retrieval-Augmented Generation (RAG) platform for financial document intelligence. It extracts structured data from financial PDFs using LLMs, offers secure multi-tenancy, real-time APIs, and is built on Python, FastAPI, Docker, and PostgreSQL.
Extract and summarise data from PDFs and images using OCR + LLMs. Built with Python, OpenCV, HuggingFace, and Flask.
Enterprise-grade RAG system featuring dual online/offline operation, multi-modal document processing, and advanced AI capabilities including knowledge graph construction and hybrid search for intelligent document analysis.
Advanced multimodal RAG system for querying PDF documents with text, images, and tables using vector embeddings, semantic chunking, and LLMs via Groq API
App for extracting structured data from photos or PDFs of documents via custom templating and commercial LLM services (GPT and Azure Document Intelligence). Developed as a Computer Science thesis at the University of Bologna.
Azure AI Samples
DocuMind is a document intelligence app where users can upload files, extract knowledge, and query them in natural language, combining semantic search (Qdrant), graph insights (Neo4j), and LLM reasoning.
A collection of solutions that leverage Azure AI services.
Agentic AI system that lets users upload documents (PDFs, DOCX, etc.) and ask natural-language questions. It uses LLM-based RAG to extract relevant information. The architecture includes multi-agent components such as document retrievers, summarizers, web searchers, and tool routers, enabling dynamic reasoning and accurate responses.
A live, evolving collection of open-source AI agents and real examples showing how businesses can use AI to automate work, save time, and explore new ideas.
IP and use case assets for CSU
Hands-on labs and mini hackathon to build a Sales Buddy Agent using Copilot Studio and Azure AI
Enterprise AI assistant for intelligent document Q&A via Slack - Advanced RAG system with multi-language support.
A comprehensive, production-ready Python pipeline for converting various document formats into clean, validated, and optimally chunked Markdown files ready for Large Language Model (LLM) consumption and NotebookLM notebooks.
🔍 AI-powered document search with semantic understanding. Find files by content using Sentence-BERT. Modern PyQt6 GUI with keyboard shortcuts, search history, and context menu. Supports PDF, DOCX, TXT. 92% precision with AI.
This project answers natural-language questions over your Excel inventory and business PDFs using a hybrid RAG pipeline. It combines semantic embeddings (FAISS) with BM25 for exact IDs, extracts structured fields (e.g., totals, GST) from PDFs and builds an explainable relationship graph; results can be exported to Neo4j for graph exploration.
The AI Document Intelligence Platform is an enterprise-oriented MVP that automates extraction, analysis, and summarization of business documents.