Adam's starred repositories
OOTDiffusion
Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
YOLO-World
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
invisible-watermark
Python library for invisible image watermark (blind image watermark)
rq-scheduler
A lightweight library that adds job scheduling capabilities to RQ (Redis Queue)
Speech-Emotion-Analyzer
The neural network model can detect five different emotions in male and female speech audio. (Deep Learning, NLP, Python)
Real3DPortrait
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis; ICLR 2024 Spotlight; Official code
AnimateLCM
AnimateLCM: Let's Accelerate the Video Generation within 4 Steps!
frechet-audio-distance
A lightweight library for Frechet Audio Distance calculation.
stable-audio-metrics
Metrics for evaluating music and audio generative models – with a focus on long-form, full-band, and stereo generations.
Diffstyler
DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization
MORPHEUS-1
Implementation of "MORPHEUS-1" from Prophetic AI and "The world’s first multi-modal generative ultrasonic transformer designed to induce and stabilize lucid dreams."
Self-Cascade
[ECCV2024] Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation
video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
music-text-representation-pp
Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval (TTMR++) [ICASSP24]