AndreJJXu's starred repositories
Diff-Foley
Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
ai-audio-datasets
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
audio2photoreal
Code and dataset for photorealistic Codec Avatars driven from audio
T2I-CompBench
[NeurIPS 2023] T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation
audio-dataset
Audio Dataset for training CLAP and other models
RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
SceneWiz3D
[CVPR 2024] SceneWiz3D: Towards Text-guided 3D Scene Composition
ControlNet-v1-1-nightly
Nightly release of ControlNet 1.1
ControlNet
Let us control diffusion models!
blended-latent-diffusion
Official implementation for "Blended Latent Diffusion" [SIGGRAPH 2023]
SyncDiffusion
Official implementation of SyncDiffusion.
MultiDiffusion
Official PyTorch Implementation for "MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation" presenting "MultiDiffusion" (ICML 2023)
ModalBiasAVSR
Official implementation of the CVPR 2024 paper: A Study of Dropout-Induced Modality Bias on Robustness to Missing Video.
clotho-dataset
Python code for handling the Clotho dataset.