Chenxi's repositories
insanely-fast-whisper
Incredibly fast Whisper-large-v3
Kandinsky-2
Kandinsky 2 — multilingual text2image latent diffusion model
video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
VideoCrafter
VideoCrafter1: Open Diffusion Models for High-Quality Video Generation
AniPortrait
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
DiffMorpher
Official Code for DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing
PixArt-sigma
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
chenxwh.github.io
A beautiful, simple, clean, and responsive Jekyll theme for academics
HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Smooth-Diffusion
[CVPR 2024] Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models
StoryDiffusion
Create Magic Story!
Depth-Anything
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Depth-Anything-V2
Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
Phased-Consistency-Model
Boosting the performance of consistency models with PCM!