melodyless's repositories
arranger
An AI for Automatic Instrumentation
audioseal
Localized watermarking for AI-generated speech audio, with state-of-the-art robustness and a very fast detector
automatic_melody_harmonization
Melody harmonization using orderless NADE, chord balancing, and blocked Gibbs sampling
Catch-A-Waveform
Official PyTorch implementation of the paper: "Catch-A-Waveform: Learning to Generate Audio from a Single Short Example" (NeurIPS 2021)
charsiu
Charsiu: A neural phonetic aligner.
clpcnet
Pitch-shifting, time-stretching, and vocoding of speech with Controllable LPCNet (CLPCNet)
DeepAFx-ST
DeepAFx-ST - Style transfer of audio effects with differentiable signal processing. Please see https://csteinmetz1.github.io/DeepAFx-ST/
deepperformer
Deep Performer: Score-to-audio music performance synthesis
denoising-historical-recordings
A two-stage U-Net for high-fidelity denoising of historical recordings
DNS-Challenge
This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.
e2e_lfmmi
E2E system with LF-MMI; word N-gram for Mandarin
FullSubNet-plus
The official PyTorch implementation of "FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement".
mctx
Monte Carlo tree search in JAX
MelSpecVAE
Variational Autoencoder in the mel-spectrogram domain for one-shot audio synthesis
MidiTok
A convenient MIDI tokenizer for Deep Learning networks, with multiple encoding strategies
MuseMorphose
PyTorch implementation of MuseMorphose, a Transformer-based model for music style transfer.
Neural-HMM
Neural HMMs are all you need (for high-quality attention-free TTS)
RapidASR
A cross-platform implementation of Wenet ASR inference, based on ONNXRuntime and Wenet. It provides a set of simpler APIs for calling Wenet models.
RAVE
Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder
RAVE-audition
VST/AU Plugin for Auditioning RAVE Models in Real-time
rVAD
Matlab and Python libraries for an unsupervised method for robust voice activity detection (rVAD), as described in the paper "rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method".
ssast
Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".
steerable-nafx
Steerable discovery of neural audio effects
Supervised-Learning-for-Multi-Zone-Sound-Field-Reproduction-under-Harsh-Environmental-Conditions
This repository provides the source code that was used to create the data for the paper "Supervised Learning for Multi Zone Sound Field Reproduction under Realistic Conditions".
wav2letter
Facebook AI Research's Automatic Speech Recognition Toolkit