Nils L. Westhausen's starred repositories
NoiseTorch
Real-time microphone noise suppression on Linux.
Soundflower
macOS system extension that lets applications pass audio to other applications. Soundflower works on macOS Catalina.
audiomentations
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
denoiser
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model working on the raw waveform that runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.
svoice
We provide a PyTorch implementation of the paper Voice Separation with an Unknown Number of Multiple Speakers, in which we present a new method for separating a mixed audio sequence in which multiple voices speak simultaneously. The new method employs gated neural networks that are trained to separate the voices at multiple processing steps, while keeping the speaker in each output channel fixed. A different model is trained for every possible number of speakers, and the model with the largest number of speakers is employed to select the actual number of speakers in a given sample. Our method greatly outperforms the current state of the art, which, as we show, is not competitive for more than two speakers.
torch-audiomentations
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
AEC-Challenge
AEC Challenge
pyminiaudio
Python interface to the miniaudio audio playback, recording, decoding and conversion library
PLC-Challenge
This repo contains required files for the INTERSPEECH 2022 Audio Deep Packet Loss Concealment (PLC) Challenge.
python_kaldi_features
Python code to extract MFCC and FBANK speech features for Kaldi
se_relativisticgan
Keras framework for speech enhancement using relativistic GANs
flopco-keras
FLOPs and other statistics COunter for tf.keras neural networks
MAPS-Scripts
A fundamental frequency estimation algorithm using features from the magnitude and phase spectrogram.
SpotifyDataAnalyzer
Analyzer of user data saved by Spotify