刘国友's starred repositories
OutfitAnyone
Outfit Anyone: Ultra-high quality virtual try-on for Any Clothing and Any Person
YOLO-World
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
Visual-Tracking-Development
Visual Object Tracking
Awesome-Denoise
One-paper-one-short-contribution-summary of all the latest image/burst/video denoising papers with code & citation published in top conferences and journals.
ml-mobileclip
This repository contains the official implementation of the research paper "MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training" (CVPR 2024).
InterpAny-Clearer
Clearer anytime frame interpolation & Manipulated interpolation of anything
Diff-Foley
Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
song-describer-dataset
The Song Describer dataset is an evaluation dataset made of ~1.1k captions for 706 permissively licensed music recordings.
MMSum_model
[CVPR 2024] MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos
RTQ-MM2023
ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding Based on Image-text Model