yiyunchen

Yiyun Chen's starred repositories

VAR-CLIP

Implements VAR+CLIP for image generation

Language: Python | 6700

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python | MIT | 402400

Deep-Live-Cam

Real-time face swap and one-click video deepfake with only a single image

Language: Python | AGPL-3.0 | 3682700

ControlNeXt

Controllable video and image generation, SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA

Language: Python | Apache-2.0 | 128300

VmambaIR

This is the official implementation of "VmambaIR: Visual State Space Model for Image Restoration"

Language: Python | 16600

SEA-RAFT

[ECCV2024 Oral] SEA-RAFT: Simple, Efficient, Accurate RAFT for Optical Flow

Language: Python | BSD-3-Clause | 25000

annotation-tool

Language: JavaScript | MIT | 13400

FineDiving

FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment

Language: Python | MIT | 10600

CogVideo

Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Language: Python | Apache-2.0 | 770900

COCOCO

Video-Inpaint-Anything: This is the inference code for our paper CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility.

Language: Python | 19400

IDM-VTON-training

IDM-VTON-training: This is unofficial training code for IDM-VTON

Language: Python | 5500

DMT

Deficiency-Aware Masked Transformer for Video Inpainting

Apache-2.0 | 5100

FuseFormer

Official PyTorch implementation of the ICCV 2021 paper FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting.

Language: Python | 11000

IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.

Language: Jupyter Notebook | Apache-2.0 | 505100

Sports-QA

Sports-QA: A Large-Scale Video Question Answering Benchmark for Complex and Professional Sports

2700

mmaction2

OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark

Language: Python | Apache-2.0 | 419100

SportsHHI

[CVPR 2024] SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos

Language: Python | 1100

FineSports_CVPR2024

700

MultiSports

[ICCV 2021] MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions

Language: Python | NOASSERTION | 10700

detectron2

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Language: Python | Apache-2.0 | 3006600

IDM-VTON

[ECCV2024] IDM-VTON: Improving Diffusion Models for Authentic Virtual Try-on in the Wild

Language: Python | 366300

Inpaint-Anything

Inpaint anything using Segment Anything and inpainting models.

Language: Jupyter Notebook | Apache-2.0 | 637400

decord

An efficient video loader for deep learning with smart shuffling that's super easy to digest

Language: C++ | Apache-2.0 | 182500

Chinese-CLIP

Chinese version of CLIP, which supports Chinese cross-modal retrieval and representation generation.

Language: Python | MIT | 435400

TAdaConv

[ICLR 2022] TAda! Temporally-Adaptive Convolutions for Video Understanding. This codebase provides solutions for video classification, video representation learning and temporal detection.

Language: Python | Apache-2.0 | 22500

EssentialMC2

EssentialMC2 Video Understanding.

Language: Python | MIT | 11400

PaddleVideo

Awesome video understanding toolkits based on PaddlePaddle. It supports video data annotation tools, lightweight RGB and skeleton-based action recognition models, and practical applications for video tagging and sport action detection.

Language: Python | Apache-2.0 | 150600

PaddleSports

Language: Python | Apache-2.0 | 9800

promptbench

A unified evaluation framework for large language models

Language: Python | MIT | 239100

havenask

Language: C++ | Apache-2.0 | 157600