wangtao2001

wangtao's starred repositories

DocLayout-YOLO

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception

Language: Python | AGPL-3.0 | 45300

MinerU

A one-stop, open-source, high-quality data extraction tool that supports extraction from PDFs, web pages, and multi-format e-books.

Language: Python | AGPL-3.0 | 1421800

PDF-Extract-Kit

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Language: Python | AGPL-3.0 | 558100

domain-adaptation-on-yolo-tiny

Language: Python | 100

agentscope

Start building LLM-empowered multi-agent applications in an easier way.

Language: Python | Apache-2.0 | 525000

swarm

Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.

Language: Python | MIT | 1598000

Awesome-Chinese-LLM

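A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are inexpensive to train. It covers base models, vertical-domain fine-tuning and applications, datasets, and tutorials.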

1595800

WSOD-Paper-List

A paper list of state-of-the-art weakly supervised object detection or localization.

8800

mathpix-markdown-it

Markdown rendering + LaTeX extras (equations, tables, ...), with conversion features, for the scientific community

Language: JavaScript | MIT | 53300

GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Language: Python | 596300

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | MIT | 7115100

pdfdir

A tool for adding navigation (outline / table of contents) to PDFs

Language: Python | GPL-3.0 | 58300

vue-advanced-cropper

The advanced Vue cropper library that gives you the opportunity to create your own croppers suited to any website design

Language: Vue | NOASSERTION | 100100

llm_aided_ocr

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.

Language: Python | 217200

360LayoutAnalysis

360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute

Apache-2.0 | 23800

llmdocparser

A package for parsing PDFs and analyzing their content using LLMs.

Language: Python | MIT | 23300

zs-design-ui

Language: TypeScript | 9600

omniparse

Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks

Language: Python | GPL-3.0 | 557200

PhT_LM

Language: Python | 200

llm-graph-builder

Neo4j graph construction from unstructured data using LLMs

Language: Jupyter Notebook | Apache-2.0 | 235300

GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs

Language: Python | Apache-2.0 | 523700

YAYI-UIE

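YAYI information extraction large model: instruction-tuned on millions of manually constructed, high-quality information extraction examples, developed by the Zhongke Wenge (中科闻歌) algorithm team. (Repo for YAYI Unified Information Extraction Model)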

Apache-2.0 | 27000

knowledge_graph

Convert any text to a graph of knowledge. This can be used for Graph Augmented Generation or Knowledge Graph based QnA

Language: Jupyter Notebook | 150800

MiniCPM

MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.

Language: Jupyter Notebook | Apache-2.0 | 712100

MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Language: Python | Apache-2.0 | 1256200

pdf_parsing

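PDF parsing (text, sections, tables, images, references), plus PDF question answering, summarization, and information extraction built on large models (ChatGLM2-6B, RWKV) + langchain + streamlit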

Language: Python | 15600

named_entity_recognition

Chinese named entity recognition, with concrete implementations of several models: HMM, CRF, BiLSTM, and BiLSTM+CRF

Language: Python | 213600

ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Language: Python | Apache-2.0 | 2262600

awesome-chinese-ner

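Chinese named entity recognition. Includes the latest Chinese NER papers, related tools, and datasets, as well as Chinese pre-trained models, word embeddings, and NER surveys.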

60500

DeepIE

DeepIE: Deep Learning for Information Extraction

Language: Python | 194000