Ameer Azam's repositories
ConsistentID
Customized, ID-consistent generation for humans
3DitScene
3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting
AniPortrait
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
APISR
APISR: Anime Production Inspired Real-World Anime Super-Resolution (CVPR 2024)
CVPR-2023-24-Papers
CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included. ⭐ support visual intelligence development!
diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
Generative_Deep_Learning_2nd_Edition
The official code repository for the second edition of the O'Reilly book Generative Deep Learning: Teaching Machines to Paint, Write, Compose and Play.
GPAvatar
[ICLR 2024] Generalizable and Precise Head Avatar from Image(s)
lightning_track
[ICLR 2024] Generalizable and Precise Head Avatar from Image(s)
llama3-from-scratch
llama3 implementation one matrix multiplication at a time
LLMs-from-scratch
Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
MimicMotion
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
mindiffusion
Repository of lessons exploring image diffusion models, focused on understanding and education.
MuseTalk
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
Parts2Whole
[arXiv 2024] From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation
Phased-Consistency-Model
Boosting the performance of consistency models with PCM!
stable-audio-tools
Generative models for conditional audio generation
SwinIR
SwinIR: Image Restoration Using Swin Transformer (official repository)
VASA-1-hack
Using Claude Opus to reverse engineer code from VASA white paper - WIP - (this is for La Raza 🎷)
VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
VOODOO3D-official
Official implementation for the paper "VOODOO 3D: Volumetric Portrait Disentanglement for One-Shot 3D Head Reenactment"