
xyzjin's starred repositories

ComfyUI

The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.

Language: Python | License: GPL-3.0 | Stargazers: 45784 | Issues: 0

anno-free-AVS

Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"

Language: Python | Stargazers: 21 | Issues: 0

GiT

🔥 [ECCV 2024] Official Implementation of "GiT: Towards Generalist Vision Transformer through Universal Language Interface"

Language: Python | License: Apache-2.0 | Stargazers: 262 | Issues: 0

all-seeing

[ICLR 2024] This is the official implementation of the paper "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World"

Language: Python | Stargazers: 433 | Issues: 0

awesome-open-gpt

Collection of Open Source Projects Related to GPT — a curated 🔥🔥 collection of GPT-related open-source projects 🚀

Language: Python | Stargazers: 5420 | Issues: 0

Awesome-ChatGPT

🤖 Awesome ChatGPT: the complete Chinese-language guide 🤖 A continuously updated knowledge base on ChatGPT. If you are interested in this field, you are welcome to follow and make use of it!

License: MIT | Stargazers: 330 | Issues: 0

InternLM-XComposer

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Language: Python | License: Apache-2.0 | Stargazers: 2365 | Issues: 0

mmengine

OpenMMLab Foundational Library for Training Deep Learning Models

Language: Python | License: Apache-2.0 | Stargazers: 1120 | Issues: 0

xtuner

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Language: Python | License: Apache-2.0 | Stargazers: 3551 | Issues: 0

groundingLMM

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Language: Python | Stargazers: 719 | Issues: 0

DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Language: Python | License: MIT | Stargazers: 1941 | Issues: 0

accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure support for FSDP and DeepSpeed.

Language: Python | License: Apache-2.0 | Stargazers: 7498 | Issues: 0
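
As a rough illustration of the workflow this library is described as providing, here is a minimal sketch that wraps a toy PyTorch training loop with accelerate's Accelerator. The linear model, optimizer, and random dataset are placeholders invented for the example, not anything taken from the repositories listed here.

import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()  # detects the device / distributed configuration automatically

# Placeholder model, optimizer, and data for illustration only.
model = torch.nn.Linear(16, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataset = TensorDataset(torch.randn(128, 16), torch.randint(0, 2, (128,)))
dataloader = DataLoader(dataset, batch_size=32)

# prepare() moves everything to the right device(s) and wraps them for DDP/FSDP/DeepSpeed.
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for inputs, targets in dataloader:
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(inputs), targets)
    accelerator.backward(loss)  # used instead of loss.backward() so mixed precision scaling works
    optimizer.step()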

CLIPSelf

[ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction

Language: Python | License: NOASSERTION | Stargazers: 156 | Issues: 0

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal chat model approaching GPT-4o performance.

Language: Python | License: MIT | Stargazers: 4934 | Issues: 0

Visual-CoT

Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning

Language: Python | License: Apache-2.0 | Stargazers: 82 | Issues: 0

LISA

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Language: Python | License: Apache-2.0 | Stargazers: 1692 | Issues: 0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | Stargazers: 18669 | Issues: 0

F-LMM

Code Release of F-LMM: Grounding Frozen Large Multimodal Models

Language: Python | License: NOASSERTION | Stargazers: 29 | Issues: 0

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 46163 | Issues: 0
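
Since this entry points at an inference API, the sketch below shows point-prompted SAM inference via the SamPredictor interface documented in the repository. The checkpoint filename and the blank stand-in image are assumptions made for illustration; a real run needs the downloaded checkpoint and an actual RGB image.

import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Assumed checkpoint path; the ViT-H weights must be downloaded separately.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = np.zeros((512, 512, 3), dtype=np.uint8)  # stand-in for an RGB image of shape (H, W, 3)
predictor.set_image(image)                       # computes the image embedding once per image

# One foreground point prompt at pixel (256, 256); label 1 = foreground, 0 = background.
masks, scores, logits = predictor.predict(
    point_coords=np.array([[256, 256]]),
    point_labels=np.array([1]),
    multimask_output=True,  # return several candidate masks with confidence scores
)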

HPT

HPT - Open Multimodal LLMs from HyperGAI

Language: Python | License: Apache-2.0 | Stargazers: 304 | Issues: 0