Siyeol Jung's starred repositories

VmambaIR

This is the official implementation of "VmambaIR: Visual State Space Model for Image Restoration"

Language: Python | Stargazers: 161 | Issues: 0

SoundStream

This repository is an implementation of this article: https://arxiv.org/pdf/2107.03312.pdf

Language: Python | Stargazers: 334 | Issues: 0

MM-Diffusion

[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation

Language: Python | License: MIT | Stargazers: 369 | Issues: 0

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | License: NOASSERTION | Stargazers: 5869 | Issues: 0

StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Language: Python | License: MIT | Stargazers: 4576 | Issues: 0

LLM-Codec

The open source code for LLM-Codec

Language: Python | Stargazers: 102 | Issues: 0

AudioLDM2

Text-to-Audio/Music Generation

Language: Python | License: NOASSERTION | Stargazers: 2185 | Issues: 0

AudioLDM

AudioLDM: Generate speech, sound effects, music and beyond, with text.

Language: Python | License: NOASSERTION | Stargazers: 2357 | Issues: 0

Resemblyzer

A Python package to analyze and compare voices with deep learning

Language: Python | License: Apache-2.0 | Stargazers: 2707 | Issues: 0

Dyadic-Interaction-Modeling

[ECCV 2024] Dyadic Interaction Modeling for Social Behavior Generation

Language: Python | License: NOASSERTION | Stargazers: 14 | Issues: 0

SALMONN

SALMONN: Speech Audio Language Music Open Neural Network

Language: Python | License: Apache-2.0 | Stargazers: 949 | Issues: 0

Qinco

Residual Quantization with Implicit Neural Codebooks

Language: Python | License: NOASSERTION | Stargazers: 41 | Issues: 0

vector-quantize-pytorch

Vector (and Scalar) Quantization, in PyTorch

Language: Python | License: MIT | Stargazers: 2309 | Issues: 0

BasicSR

Open-source image and video restoration toolbox for super-resolution, denoising, deblurring, etc. It currently includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc., and also supports StyleGAN2 and DFDNet.

Language: Python | License: Apache-2.0 | Stargazers: 6592 | Issues: 0

stable-diffusion

A latent text-to-image diffusion model

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 67194 | Issues: 0

sqvae

PyTorch implementation of the stochastically quantized variational autoencoder (SQ-VAE)

Language: Python | License: Apache-2.0 | Stargazers: 176 | Issues: 0

CVQ-VAE

[ICCV 2023] Online Clustered Codebook

Language: Python | License: MIT | Stargazers: 131 | Issues: 0

Awesome-Image-Quality-Assessment

A comprehensive collection of IQA papers

Language: TeX | License: MIT | Stargazers: 875 | Issues: 0

MMSI

Code for "Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations" (CVPR 2024 Oral)

Language: Python | License: MIT | Stargazers: 8 | Issues: 0

OmniTokenizer

OmniTokenizer: one model and one weight for image-video joint tokenization.

Language: Python | License: MIT | Stargazers: 211 | Issues: 0

IQA-PyTorch

👁️ 🖼️ 🔥PyTorch Toolbox for Image Quality Assessment, including LPIPS, FID, NIQE, NRQM(Ma), MUSIQ, TOPIQ, NIMA, DBCNN, BRISQUE, PI and more...

Language: Python | License: NOASSERTION | Stargazers: 1732 | Issues: 0

SyncTalk

[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"

Language: Python | License: NOASSERTION | Stargazers: 1173 | Issues: 0

mixture-of-experts

A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models

Language: Python | License: MIT | Stargazers: 587 | Issues: 0

mmvae

Multimodal Mixture-of-Experts VAE

Language: Python | License: GPL-3.0 | Stargazers: 187 | Issues: 0

pykan

Kolmogorov Arnold Networks

Language: Jupyter Notebook | License: MIT | Stargazers: 14097 | Issues: 0

DistgASR

[TPAMI 2022] DistgASR: Disentangling Mechanism for Light Field Angular Super-Resolution

Language: Python | Stargazers: 29 | Issues: 0

Grid-Diffusion-Models-for-Text-to-Video-Generation

Official code repository for the paper "Grid Diffusion Models for Text-to-Video Generation" (CVPR 2024)

Stargazers: 9 | Issues: 0

Generating-Realistic-Images-from-In-the-wild-Sounds

Official code repository for the paper "Generating Realistic Images from In-the-wild Sounds" (ICCV 2023)

Language: Jupyter Notebook | Stargazers: 9 | Issues: 0