AMEERAZAM08

Ameer Azam's starred repositories

IOPaint

Image inpainting tool powered by SOTA AI models. Remove any unwanted object, defect, or person from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.

Language: Python | License: Apache-2.0 | 20085 144 473

hallo

Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation

Language: Python | License: MIT | 9663 654 157

DiffSynth-Studio

Enjoy the magic of Diffusion models!

Language: Python | License: Apache-2.0 | 6699 60 167

SwinIR

SwinIR: Image Restoration Using Swin Transformer (official repository)

Language: Python | License: Apache-2.0 | 4561 53 154

stable-audio-tools

Generative models for conditional audio generation

Language: Python | License: MIT | 2813 42 103

MusePose

MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation

Language: Python | License: NOASSERTION | 2364 42 69

InstantStyle

InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥

Language: Jupyter Notebook | 1559 21 36

LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Language: Python | License: MIT | 1434 21 71

CatVTON

CatVTON is a simple and efficient virtual try-on diffusion model with 1) a lightweight network (899.06M parameters in total), 2) parameter-efficient training (49.57M trainable parameters), and 3) simplified inference (< 8G VRAM at 1024×768 resolution).

Language: Python | License: NOASSERTION | 1054 12 88

MetaPortrait

[CVPR 2023] MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation

Language: Python | License: MIT | 539 63 33

FoleyCrafter

FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley master that adds vivid, synchronized sound effects to your silent videos 😝

Language: Python | License: Apache-2.0 | 499 15 22

Phased-Consistency-Model

[NeurIPS 2024] Boosting the performance of consistency models with PCM!

Language: Python | License: Apache-2.0 | 405 19 21

TalkingGaussian

[ECCV'24] TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting

Language: Python | 292 17 63

swap-anything

Official implementation of the ECCV paper "SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing"

Language: Python | License: MIT | 236 28 7

Diffree

Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model

Language: Python | License: Apache-2.0 | 232 6 13

3DitScene

3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting

Language: Python | 193 6 21

MultiTalk

[INTERSPEECH'24] Official repository for "MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset"

Language: Python | 79 5 9

FADM

Language: Python | 72 2 8

StableAudioWebUI

A Lightweight Gradio Web interface for Text-to-Audio Generation utilising SAO1.0

Language: Python | License: Apache-2.0 | 47 4 1

Noise-free-Optimization-in-Early-Training-Steps-for-Image-Super-Resolution

[AAAI2024] Official Repository for Noise-free Optimization in Early Training Steps for Image Super-Resolution

Language: Python | 41 6 3

mindiffusion

Repository of lessons exploring image diffusion models, focused on understanding and education.

Language: Python | License: MIT | 36 0 0

anywhere-multi-agent

Language: Jupyter Notebook | 35 4 3

LipSync3D

Language: Jupyter Notebook | 32 0 0

CharacterGen

[SIGGRAPH'24] CharacterGen: Efficient 3D Character Generation from Single Images with Multi-View Pose Canonicalization

Language: JavaScript | License: AGPL-3.0 | 9 1 0

SPEAK-hack

Using Claude Sonnet to reverse engineer the paper "Listen, Disentangle, and Control: Controllable Speech-Driven Talking Head Generation"

Language: Python | 7 4 2

U2Net-better

Language: Python | License: MIT | 6 1 0

anywhere-multi-agent

Language: Jupyter Notebook | 1 0 0

mmagic

OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, image/video restoration/enhancement, etc.

Language: Jupyter Notebook | License: Apache-2.0 | 1 0 0

Upscale-A-Video

Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution

1 0 0

e4s

(CVPR 2023) E4S: Fine-grained Face Swapping via Regional GAN Inversion

License: MIT | 1 0 0