Yuzhong Zhao's starred repositories

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 46077 | Issues: 304 | Issues: 658

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | Stargazers: 18578 | Issues: 159 | Issues: 1431

Awesome-Diffusion-Models

A collection of resources and papers on Diffusion Models

Language: HTML | License: MIT | Stargazers: 10561 | Issues: 265 | Issues: 45

LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Language: Python | License: MIT | Stargazers: 10012 | Issues: 65 | Issues: 105
Language: Python | License: NOASSERTION | Stargazers: 8248 | Issues: 154 | Issues: 0

fiftyone

The open-source tool for building high-quality datasets and computer vision models

Language: Python | License: Apache-2.0 | Stargazers: 7971 | Issues: 56 | Issues: 1487

AppAgent

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

Language: Python | License: MIT | Stargazers: 4754 | Issues: 60 | Issues: 79

recognize-anything

Open-source and strong foundation image recognition models.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2659 | Issues: 26 | Issues: 154

InternImage

[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

Language: Python | License: MIT | Stargazers: 2439 | Issues: 34 | Issues: 260

Semantic-SAM

[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"

densecap

Dense image captioning in Torch

Language: Jupyter Notebook | License: MIT | Stargazers: 1575 | Issues: 68 | Issues: 89

DDNM

[ICLR 2023 Oral] Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model

Language: Python | License: MIT | Stargazers: 1099 | Issues: 27 | Issues: 72

ICCV-2023-Papers

ICCV 2023 Papers: Discover cutting-edge research from ICCV 2023, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support visual intelligence development!

Language: Python | License: MIT | Stargazers: 906 | Issues: 13 | Issues: 10

AlphaCLIP

[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 611 | Issues: 11 | Issues: 50

Awesome-Referring-Image-Segmentation

A collection of papers about Referring Image Segmentation.

SEED

Official implementation of SEED-LLaMA (ICLR 2024).

Language: Python | License: NOASSERTION | Stargazers: 541 | Issues: 14 | Issues: 48

DatasetDM

[NeurIPS 2023] DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models

VLDet

[ICLR 2023] PyTorch implementation of VLDet (https://arxiv.org/abs/2211.14843)

Language: Python | License: NOASSERTION | Stargazers: 177 | Issues: 5 | Issues: 17

Prompt-Can-Anything

You can do anything with SOTA AI through prompts, auto AI tools, VL large model fine-tuning, and projects.

Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 175 | Issues: 7 | Issues: 1

ptp

[CVPR 2023] The code for "Position-guided Text Prompt for Vision-Language Pre-training"

Language: Python | License: Apache-2.0 | Stargazers: 147 | Issues: 7 | Issues: 10

PartImageNet

Introduction and scripts for the paper "PartImageNet: A Large, High-Quality Dataset of Parts" (Ju He, Shuo Yang, Shaokang Yang, Adam Kortylewski, Xiaoding Yuan, Jie-Neng Chen, Shuai Liu, Cheng Yang, Alan Yuille).

Tube-Link

[ICCV 2023] Universal Video Segmentation for VSS, VPS, and VIS

PhraseCutDataset

Dataset API for "PhraseCut: Language-based Image Segmentation in the Wild"

Language: Jupyter Notebook | Stargazers: 100 | Issues: 7 | Issues: 4

ETRIS

[ICCV 2023] The official code for "Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation"

Language: Python | License: MIT | Stargazers: 90 | Issues: 3 | Issues: 14

WSOD-Paper-List

A paper list of state-of-the-art weakly supervised object detection or localization.

awesome-weakly-supervised-object-localization

Awesome weakly supervised object localization: a paper list and a performance list

License: GPL-3.0 | Stargazers: 72 | Issues: 7 | Issues: 0

GVT

Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".

Language: Python | License: Apache-2.0 | Stargazers: 54 | Issues: 7 | Issues: 8

ovad-benchmark-code

OVAD: Open-vocabulary Attribute Detection code

Language: Python | License: CC0-1.0 | Stargazers: 28 | Issues: 2 | Issues: 5