
xyzjin's starred repositories

ComfyUI

The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.

Language: Python | License: GPL-3.0 | Stargazers: 45784 | Issues: 0

anno-free-AVS

Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"

Language: Python | Stargazers: 21 | Issues: 0

GiT

🔥 [ECCV 2024] Official Implementation of "GiT: Towards Generalist Vision Transformer through Universal Language Interface"

Language: Python | License: Apache-2.0 | Stargazers: 262 | Issues: 0

all-seeing

[ICLR 2024] This is the official implementation of the paper "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World"

Language: Python | Stargazers: 433 | Issues: 0

awesome-open-gpt

Collection of Open Source Projects Related to GPT — a curated 🔥🔥 collection of GPT-related open-source projects 🚀

Language: Python | Stargazers: 5420 | Issues: 0

Awesome-ChatGPT

🤖 Awesome ChatGPT: the complete Chinese-language guide 🤖 A continuously updated knowledge base on ChatGPT. If you are interested in this field, you are welcome to follow and make use of it!

License: MIT | Stargazers: 330 | Issues: 0

InternLM-XComposer

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Language: Python | License: Apache-2.0 | Stargazers: 2365 | Issues: 0

mmengine

OpenMMLab Foundational Library for Training Deep Learning Models

Language: Python | License: Apache-2.0 | Stargazers: 1120 | Issues: 0

xtuner

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Language: Python | License: Apache-2.0 | Stargazers: 3551 | Issues: 0

groundingLMM

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Language: Python | Stargazers: 719 | Issues: 0

DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Language: Python | License: MIT | Stargazers: 1941 | Issues: 0

accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure support for FSDP and DeepSpeed.

Language: Python | License: Apache-2.0 | Stargazers: 7498 | Issues: 0
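
As a rough illustration of the workflow this library is described as providing, here is a minimal sketch that wraps a toy PyTorch training loop with accelerate's Accelerator. The linear model, optimizer, and random dataset are placeholders invented for the example, not anything taken from the repositories listed here.

import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()  # detects the device / distributed configuration automatically

# Placeholder model, optimizer, and data for illustration only.
model = torch.nn.Linear(16, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataset = TensorDataset(torch.randn(128, 16), torch.randint(0, 2, (128,)))
dataloader = DataLoader(dataset, batch_size=32)

# prepare() moves everything to the right device(s) and wraps them for DDP/FSDP/DeepSpeed.
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for inputs, targets in dataloader:
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(inputs), targets)
    accelerator.backward(loss)  # used instead of loss.backward() so mixed precision scaling works
    optimizer.step()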

CLIPSelf

[ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction

Language: Python | License: NOASSERTION | Stargazers: 156 | Issues: 0

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal chat model approaching GPT-4o performance.

Language: Python | License: MIT | Stargazers: 4934 | Issues: 0

Visual-CoT

Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning

Language: Python | License: Apache-2.0 | Stargazers: 82 | Issues: 0

LISA

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Language: Python | License: Apache-2.0 | Stargazers: 1692 | Issues: 0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | Stargazers: 18669 | Issues: 0

F-LMM

Code Release of F-LMM: Grounding Frozen Large Multimodal Models

Language: Python | License: NOASSERTION | Stargazers: 29 | Issues: 0

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 46163 | Issues: 0
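
Since this entry points at an inference API, the sketch below shows point-prompted SAM inference via the SamPredictor interface documented in the repository. The checkpoint filename and the blank stand-in image are assumptions made for illustration; a real run needs the downloaded checkpoint and an actual RGB image.

import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Assumed checkpoint path; the ViT-H weights must be downloaded separately.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = np.zeros((512, 512, 3), dtype=np.uint8)  # stand-in for an RGB image of shape (H, W, 3)
predictor.set_image(image)                       # computes the image embedding once per image

# One foreground point prompt at pixel (256, 256); label 1 = foreground, 0 = background.
masks, scores, logits = predictor.predict(
    point_coords=np.array([[256, 256]]),
    point_labels=np.array([1]),
    multimask_output=True,  # return several candidate masks with confidence scores
)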

HPT

HPT - Open Multimodal LLMs from HyperGAI

Language: Python | License: Apache-2.0 | Stargazers: 304 | Issues: 0