ferugit

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it can remove various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve the model's performance and generalization.

Language: Python · License: NOASSERTION · 010

quark

Efficient Keyword Spotting

Language: Python · 010

sonopytorch

Torch implementation of Sonopy

Language: Python · 020

speechbrain

A PyTorch-based Speech Toolkit

Language: Python · License: Apache-2.0 · 010

transformer-corrector

Transformer-based Spanish corrector

Language: Python · License: MIT · 010

DESED_task

Domestic environment sound event detection task

Language: Jupyter Notebook · 010

diart

Lightweight Python library for real-time streaming speaker diarization, implemented in PyTorch

Language: Python · License: MIT · 000

EfficientAT

This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training and extraction of audio embeddings.

License: MIT · 000