AndreJJXu's starred repositories

blended-latent-diffusion

Official implementation for "Blended Latent Diffusion" [SIGGRAPH 2023]

Language: Jupyter Notebook | License: MIT | Stargazers: 566 | Issues: 0

SyncDiffusion

Official implementation of SyncDiffusion.

Language: Jupyter Notebook | License: MIT | Stargazers: 152 | Issues: 0

MultiDiffusion

Official PyTorch implementation of "MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation" (ICML 2023)

Language: Jupyter Notebook | Stargazers: 994 | Issues: 0

ModalBiasAVSR

Official implementation of the CVPR 2024 paper "A Study of Dropout-Induced Modality Bias on Robustness to Missing Video".

Stargazers: 8 | Issues: 0

clotho-dataset

Python code for handling the Clotho dataset.

Language: Python | License: NOASSERTION | Stargazers: 76 | Issues: 0

ClipClap-GZSL

Audio-Visual Generalized Zero-Shot Learning using Large Pre-Trained Models

Language: Python | License: MIT | Stargazers: 12 | Issues: 0

AudioCLIP

Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)

Language: Python | License: MIT | Stargazers: 767 | Issues: 0

lyrebird-wav2clip

Official implementation of the paper "Wav2CLIP: Learning Robust Audio Representations from CLIP"

Language: Python | License: MIT | Stargazers: 325 | Issues: 0

PerceptualSimilarity

LPIPS metric. pip install lpips

Language: Python | License: BSD-2-Clause | Stargazers: 3674 | Issues: 0

Stargazers: 3 | Issues: 0

Generating-Realistic-Images-from-In-the-wild-Sounds

Official Code Repository for the paper "Generating Realistic Images from In-the-wild Sounds", ICCV 2023

Language: Jupyter Notebook | Stargazers: 10 | Issues: 0

latent-consistency-model

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Language: Python | License: MIT | Stargazers: 4364 | Issues: 0

UniS-MMC

Code for UniS-MMC: Multimodal Classification via Unimodality-supervised Multimodal Contrastive Learning (ACL 2023)

Language: Python | License: MIT | Stargazers: 32 | Issues: 0

Language: Python | License: MIT | Stargazers: 31 | Issues: 0

GALIP

[CVPR2023] A faster, smaller, and better text-to-image model for large-scale training

Language: Python | License: MIT | Stargazers: 228 | Issues: 0

Language: Python | Stargazers: 28 | Issues: 0

TPoS

This repository is for "The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion" (ICCV 2023)

Language: Python | Stargazers: 20 | Issues: 0

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | License: NOASSERTION | Stargazers: 6287 | Issues: 0

Shifted_Diffusion

Code for "Shifted Diffusion for Text-to-image Generation" (CVPR 2023)

Language: Python | License: CC0-1.0 | Stargazers: 160 | Issues: 0

AlphaCLIP

[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 689 | Issues: 0