1zeryu

YuKun Zhou's starred repositories

InfoGrowth

Efficient and Online Dataset Growth Algorithm (with cleanness and diversity awareness) to deal with growing web data

Language:Python1600

Awesome-Video-Diffusion

A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.

NC-SDEdit

[ECCV 2024] Noise Calibration: Plug-and-play Content-Preserving Video Enhancement using Pre-trained Video Diffusion Models

7200

MotionLLM

[Arxiv-2024] MotionLLM: Understanding Human Behaviors from Human Motions and Videos

Language:PythonNOASSERTION20000

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language:PythonMIT2042300

xducs

西电计科课程源码分享

Language:Jupyter Notebook2000

ERNIE-Layout-Pytorch

An unofficial Pytorch implementation of ERNIE-Layout which is originally released through PaddleNLP.

Language:PythonMIT9500

Latte

Latte: Latent Diffusion Transformer for Video Generation.

Language:PythonApache-2.0157000

MotionLCM

[ ECCV 2024 ] MotionLCM: This repo is the official implementation of "MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model"

Language:PythonNOASSERTION20000

LiGO

[ICLR 2023] "Learning to Grow Pretrained Models for Efficient Transformer Training" by Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David Cox, Zhangyang Wang, Yoon Kim

Language:PythonMIT8000

PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Language:PythonAGPL-3.0154800

consistency_models

Official repo for consistency models.

Language:PythonMIT604900

ControlNet

Let us control diffusion models!

Language:PythonApache-2.02940500

magvit2-pytorch

Implementation of MagViT2 Tokenizer in Pytorch

Language:PythonMIT51400

InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Language:PythonApache-2.0123700

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language:PythonMIT392500