Yin Xinlei's starred repositories

audiocaps

🔊 Repository for our NAACL-HLT 2019 paper: AudioCaps

Language: Python | License: MIT | Stargazers: 139 | Issues: 0

open_flamingo

An open-source framework for training large multimodal models.

Language: Python | License: MIT | Stargazers: 3690 | Issues: 0

UTMOS22

UT-Sarulab MOS prediction system using SSL models

Language: Python | License: MIT | Stargazers: 173 | Issues: 0

sigsep-mus-db

Python parser and tools for MUSDB18 Music Separation Dataset

Language: Python | License: MIT | Stargazers: 161 | Issues: 0

WritingAIPaper

Writing AI Conference Papers: A Handbook for Beginners

Stargazers: 1112 | Issues: 0

tango

A family of diffusion models for text-to-audio generation.

Language: Python | License: NOASSERTION | Stargazers: 1000 | Issues: 0

llm-tse

Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)

Language: JavaScript | Stargazers: 32 | Issues: 0

audio-retrieval-benchmark

Implementation of "Audio Retrieval with Natural Language Queries: A Benchmark Study".

Language: Python | Stargazers: 45 | Issues: 0

versatile_audio_super_resolution

Versatile audio super resolution (any -> 48kHz) with AudioSR.

Language: Python | License: MIT | Stargazers: 1115 | Issues: 0

audio-flamingo

PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.

Language: Python | License: MIT | Stargazers: 176 | Issues: 0

LLM101n

LLM101n: Let's build a Storyteller

Stargazers: 29279 | Issues: 0
Language: Python | Stargazers: 132 | Issues: 0

GESS

Code for GeSS: Benchmarking Geometric Deep Learning under Scientific Applications with Distribution Shifts

Language: Python | License: MIT | Stargazers: 13 | Issues: 0

Codec-SUPERB

Audio Codec Speech processing Universal PERformance Benchmark

Language: Python | Stargazers: 205 | Issues: 0

AcademiCodec

AcademiCodec: An Open Source Audio Codec Model for Academic Research

Language: Python | Stargazers: 575 | Issues: 0

mtg-jamendo-dataset

Metadata, scripts and baselines for the MTG-Jamendo dataset

Language: Python | License: Apache-2.0 | Stargazers: 267 | Issues: 0

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | License: NOASSERTION | Stargazers: 6110 | Issues: 0

AudioSep

Official implementation of "Separate Anything You Describe"

Language: Python | License: MIT | Stargazers: 1596 | Issues: 0

EnCLAP

Official Implementation of EnCLAP (ICASSP 2024)

Language: Python | License: MIT | Stargazers: 88 | Issues: 0

VGGSound

VGGSound: A Large-scale Audio-Visual Dataset

Language: Python | License: NOASSERTION | Stargazers: 287 | Issues: 0

Zero_Shot_Audio_Source_Separation

The official code repo for "Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data", in AAAI 2022

Language: Python | License: MIT | Stargazers: 186 | Issues: 0

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Language: Python | License: Apache-2.0 | Stargazers: 25487 | Issues: 0

ustcthesis

LaTeX template for USTC thesis

Language: TeX | License: LPPL-1.3c | Stargazers: 1614 | Issues: 0

ACT

Source code for the paper 'Audio Captioning Transformer'

Language: Jupyter Notebook | Stargazers: 48 | Issues: 0

AudioLDM-training-finetuning

AudioLDM training, finetuning, evaluation and inference.

Language: Python | License: MIT | Stargazers: 197 | Issues: 0

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | License: MIT | Stargazers: 19784 | Issues: 0

melgan-neurips

GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis

Language: Python | License: MIT | Stargazers: 964 | Issues: 0

WavCraft

Official repo for WavCraft, an AI agent for audio creation and editing

Language: Python | License: NOASSERTION | Stargazers: 650 | Issues: 0

visqol

Perceptual Quality Estimator for speech and audio

Language: C++ | License: Apache-2.0 | Stargazers: 686 | Issues: 0

AudioLDM2

Text-to-Audio/Music Generation

Language: Python | License: NOASSERTION | Stargazers: 2263 | Issues: 0