wangtao's starred repositories
DocLayout-YOLO
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
PDF-Extract-Kit
A Comprehensive Toolkit for High-Quality PDF Content Extraction
agentscope
Start building LLM-empowered multi-agent applications in an easier way.
Awesome-Chinese-LLM
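A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, including base models, vertical-domain fine-tunes and applications, datasets, and tutorials.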
WSOD-Paper-List
A paper list of state-of-the-art weakly supervised object detection or localization.
mathpix-markdown-it
Markdown rendering + LaTeX extras (equations, tables, ...), with conversion features, for the scientific community
GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
vue-advanced-cropper
An advanced Vue cropper library that lets you create your own croppers suited to any website design
llm_aided_ocr
Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.
360LayoutAnalysis
360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute
llmdocparser
A package for parsing PDFs and analyzing their content using LLMs.
llm-graph-builder
Neo4j graph construction from unstructured data using LLMs
knowledge_graph
Convert any text to a graph of knowledge. This can be used for Graph Augmented Generation or Knowledge Graph based QnA
pdf_parsing
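PDF parsing (text, sections, tables, images, references), plus PDF question answering, summarization, and information extraction based on large models (ChatGLM2-6B, RWKV) + langchain + streamlit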
named_entity_recognition
Chinese named entity recognition (with concrete implementations of several models: HMM, CRF, BiLSTM, BiLSTM+CRF)
awesome-chinese-ner
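Chinese named entity recognition. Includes the latest Chinese NER papers, related Chinese entity recognition tools, datasets, Chinese pre-trained models, word vectors, NER surveys, and more.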