Double_V (LDOUBLEV)


Company: Huazhong University of Science and Technology

Location: Wuhan, Hubei

Home Page: https://blog.csdn.net/qq_25737169


Double_V's starred repositories

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 44331 | Issues: 295 | Issues: 638

ChatGLM-6B

ChatGLM-6B: An Open Bilingual Dialogue Language Model

Language: Python | License: Apache-2.0 | Stargazers: 39428 | Issues: 394 | Issues: 1283

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Language: Python | License: MIT | Stargazers: 32140 | Issues: 343 | Issues: 290

MiniGPT-4

Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Language: Python | License: BSD-3-Clause | Stargazers: 24949 | Issues: 218 | Issues: 441

pyecharts

🎨 Python Echarts Plotting Library

Language: Python | License: MIT | Stargazers: 14479 | Issues: 379 | Issues: 1858

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's text-to-video model); we hope the open-source community will contribute to it.

Language: Python | License: MIT | Stargazers: 10256 | Issues: 151 | Issues: 152

Awesome-Multimodal-Large-Language-Models

Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.

nougat

Implementation of Nougat: Neural Optical Understanding for Academic Documents

Language: Python | License: MIT | Stargazers: 8141 | Issues: 68 | Issues: 185

Track-Anything

Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.

Language: Python | License: MIT | Stargazers: 6126 | Issues: 59 | Issues: 124

CogVLM

A state-of-the-art open visual language model | Multimodal pretrained model

Language: Python | License: Apache-2.0 | Stargazers: 5168 | Issues: 67 | Issues: 375

xtuner

An efficient, flexible and full-featured toolkit for fine-tuning large models (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Language: Python | License: Apache-2.0 | Stargazers: 2588 | Issues: 26 | Issues: 318

cv_note

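Documenting the growth path of a CV algorithm engineer and sharing notes on the computer vision and model compression/deployment tech stack. https://harleyszhang.github.io/cv_note/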

Language: C++ | License: Apache-2.0 | Stargazers: 2237 | Issues: 29 | Issues: 4

awesome-document-understanding

A curated list of resources on Document Understanding (DU)

DragDiffusion

[CVPR2024, Highlight] Official code for DragDiffusion

Language: Python | License: Apache-2.0 | Stargazers: 1031 | Issues: 26 | Issues: 53

mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Language: Python | License: Apache-2.0 | Stargazers: 968 | Issues: 29 | Issues: 56

Questgen.ai

Question generation using state-of-the-art Natural Language Processing algorithms

Language: Python | License: MIT | Stargazers: 874 | Issues: 27 | Issues: 48
Language: Python | License: NOASSERTION | Stargazers: 681 | Issues: 8 | Issues: 61

lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval

OCR-SAM

Combining MMOCR with Segment Anything & Stable Diffusion. Automatically detect, recognize, and segment text instances, with several downstream tasks, e.g., Text Removal and Text Inpainting.

PaddleTS

Awesome easy-to-use deep time series modeling based on PaddlePaddle, including comprehensive functionality modules such as TSDataset, Analysis, Transform, Models, AutoTS, and Ensemble, supporting versatile tasks such as time series forecasting, representation learning, and anomaly detection, and featuring quick tracking of SOTA deep models.

Language: Python | License: Apache-2.0 | Stargazers: 453 | Issues: 20 | Issues: 153

grounded-segment-any-parts

Grounded Segment Anything: From Objects to Parts

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 362 | Issues: 5 | Issues: 10

text2text

Text2Text: Crosslingual NLP/G toolkit

Language: Python | License: NOASSERTION | Stargazers: 274 | Issues: 10 | Issues: 33

OneChart

Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"

Language: Python | License: Apache-2.0 | Stargazers: 86 | Issues: 1 | Issues: 5

Kosmos2.5

My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model"

Language: Python | License: MIT | Stargazers: 58 | Issues: 1 | Issues: 1
Language: Python | License: MIT | Stargazers: 51 | Issues: 4 | Issues: 12

Transformer_Distillation

Knowledge Distillation For Transformer Language Models

SciCap

SciCap Dataset

Stargazers: 47 | Issues: 0 | Issues: 0

docvqa-gen

Generator for document visual question answering datasets in English and Chinese

Language: Jupyter Notebook | Stargazers: 22 | Issues: 3 | Issues: 1