wangtao (wangtao2001)

Company: China Pharmaceutical University

Location: Nanjing, China


Organizations
CPU-DS
JSREI

wangtao's starred repositories

DocLayout-YOLO

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception

Language: Python | License: AGPL-3.0 | Stargazers: 453 | Issues: 0

MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

Language: Python | License: AGPL-3.0 | Stargazers: 14218 | Issues: 0

PDF-Extract-Kit

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Language: Python | License: AGPL-3.0 | Stargazers: 5581 | Issues: 0

Language: Python | Stargazers: 1 | Issues: 0

agentscope

Start building LLM-empowered multi-agent applications in an easier way.

Language: Python | License: Apache-2.0 | Stargazers: 5250 | Issues: 0

swarm

Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.

Language: Python | License: MIT | Stargazers: 15980 | Issues: 0

Awesome-Chinese-LLM

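A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at low cost, covering base models, domain-specific fine-tuning and applications, datasets, and tutorials.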

Stargazers: 15958 | Issues: 0

WSOD-Paper-List

A paper list of state-of-the-art weakly supervised object detection or localization.

Stargazers: 88 | Issues: 0

mathpix-markdown-it

Markdown rendering + LaTeX extras (equations, tables, ...), with conversion features, for the scientific community

Language: JavaScript | License: MIT | Stargazers: 533 | Issues: 0

GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Language: Python | Stargazers: 5963 | Issues: 0

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 71151 | Issues: 0

pdfdir

A tool for adding navigation (outline/table of contents) to PDFs

Language: Python | License: GPL-3.0 | Stargazers: 583 | Issues: 0

vue-advanced-cropper

The advanced Vue cropper library that gives you the opportunity to create your own croppers suited to any website design

Language: Vue | License: NOASSERTION | Stargazers: 1001 | Issues: 0

llm_aided_ocr

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.

Language: Python | Stargazers: 2172 | Issues: 0

360LayoutAnalysis

360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute

License: Apache-2.0 | Stargazers: 238 | Issues: 0

llmdocparser

A package for parsing PDFs and analyzing their content using LLMs.

Language: Python | License: MIT | Stargazers: 233 | Issues: 0

Language: TypeScript | Stargazers: 96 | Issues: 0

omniparse

Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks

Language: Python | License: GPL-3.0 | Stargazers: 5572 | Issues: 0

Language: Python | Stargazers: 2 | Issues: 0

llm-graph-builder

Neo4j graph construction from unstructured data using LLMs

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2353 | Issues: 0

GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs

Language: Python | License: Apache-2.0 | Stargazers: 5237 | Issues: 0

YAYI-UIE

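YAYI large model for information extraction: instruction-tuned on millions of manually constructed, high-quality information extraction examples, developed by the Zhongke Wenge (中科闻歌) algorithm team. (Repo for YAYI Unified Information Extraction Model)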

License: Apache-2.0 | Stargazers: 270 | Issues: 0

knowledge_graph

Convert any text to a graph of knowledge. This can be used for Graph Augmented Generation or Knowledge Graph based QnA

Language: Jupyter Notebook | Stargazers: 1508 | Issues: 0

MiniCPM

MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 7121 | Issues: 0

MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Language: Python | License: Apache-2.0 | Stargazers: 12562 | Issues: 0

pdf_parsing

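PDF parsing (text, sections, tables, images, references), plus PDF question answering, summarization, and information extraction built on large models (ChatGLM2-6B, RWKV) + langchain + streamlit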

Language: Python | Stargazers: 156 | Issues: 0

named_entity_recognition

Chinese named entity recognition (with concrete implementations of several models: HMM, CRF, BiLSTM, BiLSTM+CRF)

Language: Python | Stargazers: 2136 | Issues: 0

ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Language: Python | License: Apache-2.0 | Stargazers: 22626 | Issues: 0

awesome-chinese-ner

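Chinese named entity recognition. Includes the latest Chinese NER papers, related tools, and datasets, as well as Chinese pretrained models, word embeddings, and NER surveys.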

Stargazers: 605 | Issues: 0

DeepIE

DeepIE: Deep Learning for Information Extraction

Language: Python | Stargazers: 1940 | Issues: 0