Beast code in Giters

ZWCui's starred repositories

MuseTalk

MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting

Language: Python | License: NOASSERTION | 212200

InternLM-XComposer

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Language: Python | License: Apache-2.0 | 235100

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal dialogue model approaching GPT-4o performance.

Language: Python | License: MIT | 466900

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | License: Apache-2.0 | 3603400

llama

Inference code for Llama models

Language: Python | License: NOASSERTION | 5490600

Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions such as HF TGI and vLLM for local or cloud deployment. Demo apps showcase Meta Llama3 for WhatsApp & Messenger.

Language: Jupyter Notebook | 1117800

vstar

PyTorch Implementation of "V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs"

Language: Python | License: MIT | 48600

algorithm-visualizer

Interactive Online Platform that Visualizes Algorithms from Code

Language: JavaScript | License: MIT | 4648600

shikra

Language: Python | License: NOASSERTION | 71600

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | License: MIT | 1932300

recognize-anything

Open-source and strong foundation image recognition models.

Language: Jupyter Notebook | License: Apache-2.0 | 265900

fMRI-reconstruction-NSD

fMRI-to-image reconstruction on the NSD dataset.

Language: Jupyter Notebook | License: MIT | 28800

lvis-api

Python API for LVIS Dataset

Language: Python | License: NOASSERTION | 40200

LLaVA-Med

Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.

Language: Python | License: NOASSERTION | 136700

LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Language: Jupyter Notebook | License: BSD-3-Clause | 934200

paco

This repo contains the documentation and code needed to use the PACO dataset: data loaders, training and evaluation scripts for object, part, and attribute prediction models, query evaluation scripts, and visualization notebooks.

Language: Python | License: MIT | 26300

MedSAM

Segment Anything in Medical Images

Language: Jupyter Notebook | License: Apache-2.0 | 254300

VLPart

[ICCV2023] VLPart: Going Denser with Open-Vocabulary Part Segmentation

Language: Python | License: MIT | 34300

OpenPSG

Benchmarking Panoptic Scene Graph Generation (PSG), ECCV'22

Language: Python | License: MIT | 40400

Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Language: Jupyter Notebook | License: Apache-2.0 | 1443800

grounded-segment-any-parts

Grounded Segment Anything: From Objects to Parts

Language: Jupyter Notebook | License: NOASSERTION | 37800

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | 1857100

RelateAnything

Relate Anything Model is capable of taking an image as input and utilizing SAM to identify the corresponding mask within the image.

Language: Python | License: Apache-2.0 | 43800

XrayGLM

🩺 The first Chinese multimodal medical large model that can read chest X-rays, built for chest radiograph summarization.

Language: Python | License: NOASSERTION | 86100

MedCLIP

EMNLP'22 | MedCLIP: Contrastive Learning from Unpaired Medical Images and Texts

Language: Python | 40500

MiniGPT-4

Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Language: Python | License: BSD-3-Clause | 2520900

VisualGLM-6B

Chinese and English multimodal conversational language model

Language: Python | License: Apache-2.0 | 405100

XrayGPT

[BIONLP@ACL 2024] XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.

Language: Python | 45000

againcui