yhzhouowo's starred repositories
CVPR2024-Papers-with-Code
A collection of CVPR 2024 papers and open-source projects
VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
ProPainter
[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting
ISAT_with_segment_anything
Labeling tool with SAM (Segment Anything Model); supports SAM, sam-hq, MobileSAM, EdgeSAM, etc. An interactive, semi-automatic image annotation tool.
VideoMamba
[ECCV 2024] VideoMamba: State Space Model for Efficient Video Understanding
multimodal-prompt-learning
[CVPR 2023] Official repository of the paper "MaPLe: Multi-modal Prompt Learning".
speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
VisionMamba
Implementation of Vision Mamba from the paper "Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model". It's 2.8x faster than DeiT and saves 86.8% GPU memory when performing batch inference to extract features on high-resolution images.
Awesome-Human-Activity-Recognition
An up-to-date, curated list of awesome IMU-based Human Activity Recognition (Ubiquitous Computing) papers, methods, and resources. Please note that most of the collected research is based on IMU data.
awesome-self-supervised-multimodal-learning
[T-PAMI] A curated list of self-supervised multimodal learning resources.
Recent-Image-Quality-Related-Papers
A list of image quality related papers published in top conferences and journals
class-incremental-learning
PyTorch implementation of a VAE-based generative classifier, as well as other class-incremental learning methods that do not store data (DGR, BI-R, EWC, SI, CWR, CWR+, AR1, the "labels trick", SLDA).
Multimodal-Learning-with-Alternating-Unimodal-Adaptation
Multimodal Learning Method MLA for CVPR 2024
ICLR2024-REDL
[ICLR 2024 Spotlight] R-EDL: Relaxing Nonessential Settings of Evidential Deep Learning