Beast code in Giters

Personal Project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Supports [video/image/multi-image] {sft/conversations}. Don't let poverty limit your imagination! Train your own 8B/14B LLaVA-training-like MLLM on RTX 3090/4090 24GB.

Language: Jupyter Notebook | 307 0 0

fastcomposer

FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention

Language: Python | License: MIT | 626 0 0

MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Language: Python | License: Apache-2.0 | 3097 0 0

bubogpt

BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs

Language: Python | License: BSD-3-Clause | 491 0 0

imageinwords

Data release for the ImageInWords (IIW) paper.

Language: JavaScript | 184 0 0

pykan

Kolmogorov Arnold Networks

Language: Jupyter Notebook | License: MIT | 13788 0 0

UESTC-Glasgow-Final-Year-Report-Template

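LaTeX template for the final-year project report at Glasgow College, UESTC (University of Electronic Science and Technology of China)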

Language: TeX | License: GPL-3.0 | 13 0 0

Segment-Everything-Everywhere-All-At-Once

[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"

Language: Python | License: Apache-2.0 | 4218 0 0

minGPT

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

Language: Python | License: MIT | 19458 0 0

Caption-Anything

Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything

Language: Python | License: BSD-3-Clause | 1633 0 0

NExT-Chat

The code of the paper "NExT-Chat: An LMM for Chat, Detection and Segmentation".

Language: Python | License: Apache-2.0 | 181 0 0

pyreft

ReFT: Representation Finetuning for Language Models

Language: Python | License: Apache-2.0 | 957 0 0

InternLM-XComposer

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Language: Python | 2231 0 0

Prompt-Highlighter

[CVPR 2024] Prompt Highlighter: Interactive Control for Multi-Modal LLMs

Language: Python | License: MIT | 110 0 0

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python | License: MIT | 3847 0 0

VoyageWang

VoyageWang's starred repositories

Paints-UNDO

F-LMM

chameleon

GLM-4

direct-preference-optimization

STIC

PhraseCutDataset

shikra

InstructBLIP_PEFT

MPP-LLaVA