AndreJJXu's starred repositories

blended-latent-diffusion

Official implementation for "Blended Latent Diffusion" [SIGGRAPH 2023]

Language: Jupyter Notebook | License: MIT | 56600

SyncDiffusion

Official implementation of SyncDiffusion.

Language: Jupyter Notebook | License: MIT | 15200

MultiDiffusion

Official PyTorch implementation of "MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation" (ICML 2023)

Language: Jupyter Notebook | 99400

ModalBiasAVSR

Official implementation of the CVPR 2024 paper "A Study of Dropout-Induced Modality Bias on Robustness to Missing Video".

800

clotho-dataset

Python code for handling the Clotho dataset.

Language: Python | License: NOASSERTION | 7600

ClipClap-GZSL

Audio-Visual Generalized Zero-Shot Learning using Large Pre-Trained Models

Language: Python | License: MIT | 1200

AudioCLIP

Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)

Language: Python | License: MIT | 76700

lyrebird-wav2clip

Official implementation of the paper "Wav2CLIP: Learning Robust Audio Representations from CLIP"

Language: Python | License: MIT | 32500

PerceptualSimilarity

LPIPS metric. pip install lpips

Language: Python | License: BSD-2-Clause | 367400

FPD

300

Generating-Realistic-Images-from-In-the-wild-Sounds

Official Code Repository for the paper "Generating Realistic Images from In-the-wild Sounds", ICCV 2023

Language: Jupyter Notebook | 1000

latent-consistency-model

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Language: Python | License: MIT | 436400

UniS-MMC

Code for UniS-MMC: Multimodal Classification via Unimodality-supervised Multimodal Contrastive Learning (ACL 2023)

Language: Python | License: MIT | 3200

PSL-MOBO

Language: Python | License: MIT | 3100

GALIP

[CVPR2023] A faster, smaller, and better text-to-image model for large-scale training

Language: Python | License: MIT | 22800

Sound2Scene

Language: Python | 2800

TPoS

This repository is for "The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion" (ICCV 2023)

Language: Python | 2000

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | License: NOASSERTION | 628700

Shifted_Diffusion

Code for "Shifted Diffusion for Text-to-image Generation" (CVPR 2023)

Language: Python | License: CC0-1.0 | 16000

AlphaCLIP

[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Language: Jupyter Notebook | License: Apache-2.0 | 68900