document-understanding

There are 22 repositories under the document-understanding topic.

infiniflow / ragflow
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
document-understanding llm rag deep-learning document-parser retrieval-augmented-generation agent graphrag ai-search deepseek deepseek-r1 ollama ai agentic-ai mcp openai agentic deep-research agentic-workflow multi-agent
Language:TypeScript 64283
deepdoctection / deepdoctection
A Repo For Document AI
document-parser document-image-analysis table-recognition ocr document-ai document-understanding python document-layout-analysis table-detection pytorch tensorflow publaynet pubtabnet layoutlm nlp
Language:Python 2957
X-PLUG / mPLUG-DocOwl
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
chart-understanding document-understanding mllm multimodal multimodal-large-language-models table-understanding
Language:Python 2150
AlibabaResearch / AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
artificial-intelligence documentai multimodal multimodal-deep-learning ocr computer-vision vision-language-transformer end-to-end-ocr scene-text-detection scene-text-detection-recognition scene-text-recognition text-detection text-recognition vision-language document document-analysis document-recognition document-understanding document-intelligence vision-language-model
Language:C++ 1769
tstanislawek / awesome-document-understanding
A curated list of resources for the Document Understanding (DU) topic
awesome-list machine-learning information-extraction key-information-extraction document-understanding robotic-process-automation document-analysis document-layout-analysis ocr natural-language-processing deep-learning nlp awesome pdf rpa pdf-documents document-intelligence unstructured-data intelligent-processing document-ai
1457
OpenBMB / VisRAG
Parsing-free RAG supported by VLMs
rag retrieval retrieval-augmented-generation vision-language-model multi-modal multi-modality document-retrieval document-understanding
Language:Python 786
wenwenyu / PICK-pytorch
Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)
key-information-extraction document-analysis graph-neural-networks graph-convolutional-network graph-learning document-understanding
Language:Python 568
jpWang / LiLT
Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)
nlp document-ai document-analysis document-understanding information-extraction multimodal-pre-trained-model multilingual-models
Language:Python 355
GoogleCloudPlatform / document-ai-samples
Sample applications and demos for Document AI, the end-to-end document processing platform on Google Cloud
document-understanding machine-learning ocr pdf python samples
Language:Jupyter Notebook 267
SCUT-DLVCLab / Document-AI-Recommendations
Algorithms, papers, datasets, performance comparisons for Document AI. Continuously updating.
document-ai document-understanding key-information-extraction table-structure-recognition visual-information-extraction
186
MathamPollard / awesome-table-structure-recognition
A curated list of awesome Table Structure Recognition (TSR) research, including models, papers, datasets, and code. Continuously updating.
table-detection table-structure-recognition table-extraction table-functional-analysis document-understanding
174
huggingface / chug
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
computer-vision dataloading datasets distributed-training document-understanding multi-modal-learning pdf-document webdataset
Language:Python 157
Alpha-Innovator / DocGenome
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models
document-understanding question-answering paper-annotation
Language:Jupyter Notebook 144
andreagemelli / doc2graph
Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.
deep-learning document-understanding geometric-deep-learning gnn key-information-extraction layout-analysis nlp table-detection pytorch
Language:Jupyter Notebook 120
doc-analysis / ReadingBank
ReadingBank: A Benchmark Dataset for Reading Order Detection
ocr nlp natural-language-processing document-understanding document-ai document-intelligence
109
LynnHaDo / Document-Layout-Analysis
Object Detection Model for Scanned Documents
document-understanding object-detection python yolov8
Language:Jupyter Notebook 90
LynnHaDo / Checkbox-Detection
Checkbox Detection Model for Scanned Documents
document-understanding object-detection python yolov8 computer-vision copy-paste deep-learning
Language:Jupyter Notebook 64
microsoft / CompHRDoc
Datasets and Evaluation Scripts for CompHRDoc
document-structure-analysis document-understanding rag-related
Language:Python 36
ZeningLin / PEneo
[MM'2024] PEneo, an effective algorithm for key-value pair extraction from form-like documents, designed for real-world applications.
document-ai document-understanding key-information-extraction ocr visual-information-extraction
Language:Python 28
NExTplusplus / TAT-DQA
TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning
document-understanding question-answering vqa
23
SCUT-DLVCLab / RFUND
[MM'2024] Official release of RFUND introduced in the MM'2024 paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction"
document-understanding visual-information-extraction document-ai key-information-extraction ocr
19
uakarsh / TiLT-Implementation
Implementation of the paper: Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer.
deep-learning document-understanding pytorch-implementation pytorch-lightning transformers
Language:Jupyter Notebook 17
jacobmarks / pytesseract-ocr-plugin
Run optical character recognition with PyTesseract from the FiftyOne App!
computer-vision document-understanding fiftyone nlp ocr plugin python tesseract tesseract-ocr
Language:Python 10
dhorvay / document-understanding-ebook
(WIP) ✨ A comprehensive resource for understanding the world of software used in the Document Understanding field. 🧙✨
document-ai document-understanding awesome-document-understanding ebook ocr
Language:Markdown 5
irgroup / labelstudio-to-fonduer
This small module connects Label Studio with Fonduer by creating a Fonduer labeling function for gold labels from a Label Studio export. Documentation: https://irgroup.github.io/labelstudio-to-fonduer/
data-annotation document-understanding fonduer knowledge-base-construction label-studio
Language:Python 5
javier-marti-isasi / OCR-free-Document-Understanding-with-Donut-Transformer
This project tackles a real-world challenge of automating client document processing, with a focus on enhancing document classification, error detection, data extraction, and validation.
document-classification document-understanding ocr
Language:Jupyter Notebook 5
bwnyasse / dart-documentai-samples
A hands-on CLI tool sample showcasing the integration of Dart with Google Cloud's DocumentAI.
dart dartlang document-ai document-understanding google-cloud machine-learning samples
Language:Dart 3
ExtrieveTechnologies / QuickCapture_IOS
QuickCapture Mobile Scanning SDK, specially designed for native iOS
document-classification document-scanner-app document-scanning-sdk document-understanding ios objective-c swift
Language:Objective-C 2
ExtrieveTechnologies / QuickCapture_Android
QuickCapture Mobile Scanning SDK from Extrieve, specially designed for native Android
document-scanning-sdk document-scanner-app document-scanner document-understanding android java kotllin
Language:Kotlin 1
mycielski / textract_study
Analysing expense reports/invoices with AWS Textract and boto3.
boto3 document-understanding expenses invoices textract aws aws-cli script shell
Language:Python 1
TomQuez / LLM_document_understanding
benchmarking document-understanding html llm pdf-converter
Language:HTML 1
phong-lt / LiGT_VQA
This repository includes the ReceiptVQA dataset and the PyTorch implementation of the LiGT method and other evaluated baselines.
document-understanding vietnamese-language visual-question-answering
Language:Python 0
msamoncenko / ReadInvoiceProject
This project automates the processing of Invoices using the Dispatcher - Performer model in UiPath and Document Understanding. The process involves asking the user for a date, navigating to a website to upload invoices where the Due Date matches a given condition, and adding each invoice as a queue item in the Orchestrator.
document-understanding reframework uipath
Language:HTML
