Jiapeng Wang (jpWang)

jpWang

Geek Repo

Company:South China University of Technology

Location:Guangzhou, China

Github PK Tool:Github PK Tool


Organizations
SCUT-DLVCLab

Jiapeng Wang's starred repositories

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Language:PythonLicense:Apache-2.0Stargazers:7419Issues:0Issues:0

VILA

VILA - A multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Language:PythonLicense:Apache-2.0Stargazers:196Issues:0Issues:0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language:PythonLicense:Apache-2.0Stargazers:21358Issues:0Issues:0

trl

Train transformer language models with reinforcement learning.

Language:PythonLicense:Apache-2.0Stargazers:8550Issues:0Issues:0

llama3

The official Meta Llama 3 GitHub site

Language:PythonLicense:NOASSERTIONStargazers:22251Issues:0Issues:0

spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python

Language:PythonLicense:MITStargazers:29104Issues:0Issues:0

sglang

SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.

Language:PythonLicense:Apache-2.0Stargazers:2677Issues:0Issues:0

self-llm

《开源大模型食用指南》基于Linux环境快速部署开源大模型,更适合**宝宝的部署教程

Language:Jupyter NotebookLicense:Apache-2.0Stargazers:5573Issues:0Issues:0
Language:PythonStargazers:395Issues:0Issues:0

Awesome-Foundation-Models

A curated list of foundation models for vision and language tasks

Stargazers:637Issues:0Issues:0

mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Language:PythonLicense:Apache-2.0Stargazers:1072Issues:0Issues:0

Ask-Anything

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Language:PythonLicense:MITStargazers:2821Issues:0Issues:0

Chat-UniVi

[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

Language:PythonLicense:Apache-2.0Stargazers:688Issues:0Issues:0

RFUND

Official release of RFUND introduced in the paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction" (arXiv:2401.03472).

Stargazers:14Issues:0Issues:0

baselines

The code related to the baselines from NeurIPS 2021 paper "DUE: End-to-End Document Understanding Benchmark."

Language:PythonLicense:MITStargazers:36Issues:0Issues:0

Multimodal-AND-Large-Language-Models

Paper list about multimodal and large language models, only used to record papers I read in the daily arxiv for personal needs.

Stargazers:435Issues:0Issues:0

InternLM-XComposer

InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.

Language:PythonStargazers:1877Issues:0Issues:0

InternVideo

Video Foundation Models & Data for Multimodal Understanding

Language:PythonLicense:Apache-2.0Stargazers:1073Issues:0Issues:0

docile

DocILE: Document Information Localization and Extraction Benchmark

Language:PythonLicense:MITStargazers:113Issues:0Issues:0

seqeval

A Python framework for sequence labeling evaluation(named-entity recognition, pos tagging, etc...)

Language:PythonLicense:MITStargazers:1058Issues:0Issues:0

MediaCrawler

小红书笔记 | 评论爬虫、抖音视频 | 评论爬虫、快手视频 | 评论爬虫、B 站视频 | 评论爬虫、微博帖子 | 评论爬虫

Language:PythonLicense:NOASSERTIONStargazers:14652Issues:0Issues:0

InstructDoc

InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions (AAAI2024)

Language:PythonLicense:NOASSERTIONStargazers:129Issues:0Issues:0

vrdu

We identify the desiderata for a comprehensive benchmark and propose Visually Rich Document Understanding (VRDU). VRDU contains two datasets that represent several challenges: rich schema including diverse data types, complex templates, and diversity of layouts within a single document type.

Stargazers:67Issues:0Issues:0

GLM

GLM (General Language Model)

Language:PythonLicense:MITStargazers:3083Issues:0Issues:0

ChatGLM3

ChatGLM3 series: Open Bilingual Chat LLMs | 开源双语对话语言模型

Language:PythonLicense:Apache-2.0Stargazers:12888Issues:0Issues:0

CogVLM

a state-of-the-art-level open visual language model | 多模态预训练模型

Language:PythonLicense:Apache-2.0Stargazers:5520Issues:0Issues:0

Baichuan2

A series of large language models developed by Baichuan Intelligent Technology

Language:PythonLicense:Apache-2.0Stargazers:4016Issues:0Issues:0

fastText

Library for fast text representation and classification.

Language:HTMLLicense:MITStargazers:25685Issues:0Issues:0

Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Language:PythonLicense:Apache-2.0Stargazers:12210Issues:0Issues:0

Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Language:PythonLicense:NOASSERTIONStargazers:4146Issues:0Issues:0