alisonbma

Alison Bernice Ma's repositories

aiSFX

Representation Learning for the Automatic Indexing of Sound Effects Libraries (ISMIR 2022): Deep audio embeddings pre-trained on UCS & Non-UCS-compliant datasets.

Language: Python | License: CC-BY-4.0 | 41 4 2

ai-audio-startups

Community list of startups working with AI in audio and music technology

License: Apache-2.0 | 1 0 0

astronify

Astronomical data sonification.

Language: Python | 0 0 0

audio-diffusion-pytorch

Audio generation using diffusion models, in PyTorch.

Language: Python | License: MIT | 0 0 0

auraloss

Collection of audio-focused loss functions in PyTorch

Language: Python | License: Apache-2.0 | 0 0 0

CLAP

Contrastive Language-Audio Pretraining

Language: Python | License: CC0-1.0 | 0 0 0

CLAP-microsoft

Learning audio concepts from natural language supervision

Language: Python | License: MIT | 0 0 0

CLIP

CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image

Language: Jupyter Notebook | License: MIT | 0 0 0

ddsp

DDSP: Differentiable Digital Signal Processing

Language: Python | License: Apache-2.0 | 0 0 0

descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1/24 kHz mono/stereo audio.

Language: Python | License: MIT | 0 0 0

jukebox

Code for the paper "Jukebox: A Generative Model for Music"

Language: Python | License: NOASSERTION | 0 0 0

micro-tcn

Efficient neural networks for analog audio effect modeling

Language: Python | License: Apache-2.0 | 0 0 0

MIR_Identifying_Actions_for_Sound_Event_Classification

Language: Jupyter Notebook | 0 0 0

MIR_ismir2018-oss-tutorial

ISMIR2018 Tutorial on Open Source and Reproducibility in MIR Research

Language: Jupyter Notebook | License: MIT | 0 0 0

MIR_openl3

OpenL3: Open-source deep audio and image embeddings

Language: Jupyter Notebook | License: MIT | 0 0 0

ML_embedding-playbook

You want to embed your Tableau content in lots of places. Start here.

Language: CSS | 0 0 0

encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Language: Python | License: MIT | 0 0 0

pedalboard

🎛 🔊 A Python library for working with audio.

License: GPL-3.0 | 0 0 0

PUBLICATIONS_paperTemplates

Repository for paper templates in ISMIR Proceedings

Language: TeX | 0 0 0

Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Language: Python | License: NOASSERTION | 0 0 0

Resemblyzer

A Python package to analyze and compare voices with deep learning

Language: Python | License: Apache-2.0 | 0 0 0

riffusion

Stable diffusion for real-time music generation

Language: Python | License: MIT | 0 0 0

riffusion-app

Stable diffusion for real-time music generation (web app)

Language: TypeScript | License: MIT | 0 0 0

sample-generator

Tools to train a generative model on arbitrary audio samples

Language: Jupyter Notebook | License: MIT | 0 0 0

SoundStream

This repository is an implementation of this article: https://arxiv.org/pdf/2107.03312.pdf

0 0 0

speechmetrics

A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR

Language: Python | License: MIT | 0 0 0

tiny-audio-diffusion

A repository for generating and training short audio samples with unconditional waveform diffusion on accessible consumer hardware (<2GB VRAM GPU)

Language: Python | License: MIT | 0 0 0

visqol

Perceptual Quality Estimator for speech and audio

License: Apache-2.0 | 0 0 0

VoiceLab

Automated Reproducible Acoustical Analysis

0 0 0

WaveRNN

WaveRNN Vocoder + TTS

License: MIT | 0 0 0