dbralios

Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities"

Language: Python | License: MIT | 2300

MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

Language: Python | License: Apache-2.0 | 176300

Cacophony

Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986

Language: Python | License: MIT | 2100

Awesome-instruction-tuning

A curated list of awesome instruction tuning datasets, models, papers and repositories.

Language: Python | License: Apache-2.0 | 25400

Multimodal-AND-Large-Language-Models

A list of papers on multimodal and large language models, kept for personal use to record the papers I read from the daily arXiv listings.

41100

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | 403900

aac-datasets

Audio Captioning datasets for PyTorch.

Language: Python | License: MIT | 8900

julius

Fast PyTorch-based DSP for audio and 1D signals

Language: Python | License: MIT | 40700

improved-diffusion

Release for Improved Denoising Diffusion Probabilistic Models

Language: Python | License: MIT | 289900

python-audio-effects

Apply audio effects such as reverb and EQ directly to audio files or NumPy ndarrays.

Language: Python | License: MIT | 38000

fma

FMA: A Dataset For Music Analysis

Language: Jupyter Notebook | License: MIT | 215400

tuning_playbook

A playbook for systematically maximizing the performance of deep learning models.

License: NOASSERTION | 2542400

audiolazy

Expressive Digital Signal Processing (DSP) package for Python

Language: Python | License: GPL-3.0 | 68300

computersandmusic

Notebooks for the EPFL class "Computers and Music".

Language: Jupyter Notebook | 2000

easyeffects

Limiter, compressor, convolver, equalizer, auto volume, and many other plugins for PipeWire applications

Language: C++ | License: GPL-3.0 | 603400

ltu

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

Language: Python | 30600

stable-audio-tools

Generative models for conditional audio generation

Language: Python | License: MIT | 150000

ddc_onset

Music onset detector from Dance Dance Convolution packaged as a lightweight PyTorch module

Language: Python | License: MIT | 3000

Pengi

An Audio Language model for Audio Tasks

Language: Python | License: MIT | 25400

heterogeneous_separation

Code and data recipes for the paper: Heterogeneous Target Speech Separation

Language: Python | License: MIT | 3800

MusicVAE

Language: Jupyter Notebook | 4700

optimal_condition_training

Code and data recipes for the paper: Optimal Condition Training for Target Source Separation by Efthymios Tzinis, Gordon Wichern, Paris Smaragdis and Jonathan Le Roux

Language: Python | License: MIT | 1200

musiclm-pytorch

Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in PyTorch

Language: Python | License: MIT | 306100