zhaoforever's repositories
bitsandbytes
Library for 8-bit optimizers and quantization routines.
bssaec2020
A New Perspective of Auxiliary-Function-Based Independent Component Analysis in Acoustic Echo Cancellation
EasyComDataset
The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmented-reality (AR)-motivated, multi-sensor egocentric world view.
flops-counter.pytorch
FLOPs counter for convolutional networks in the PyTorch framework
HRTF-construction
Code for HRTF database construction
model-compression
Model compression based on PyTorch: (1) quantization: 16/8/4/2-bit (DoReFa; "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and ternary/binary weights (TWN/BNN/XNOR-Net); (2) pruning: normal, regular, and group-convolution channel pruning; (3) group convolution structure; (4) batch-normalization folding for quantization
pedalboard
A Python library for adding effects to audio.
PseudoBinaural_CVPR2021
Codebase for the paper "Visually Informed Binaural Audio Generation without Binaural Audios" (CVPR 2021)
python-pesq-1
PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band)
RIR-Generator
Generating room impulse responses
room-impulse-responses
A list of publicly available room impulse response datasets and scripts to download them.
SepStereo_ECCV2020
Codebase for the paper "Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation" (ECCV2020)
sofamyroom
Room acoustic simulator with a SOFA file loader.
speechbrain
A PyTorch-based Speech Toolkit
speechmetrics
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
Subband-Music-Separation
PyTorch: Channel-wise subband input for better voice and accompaniment separation
svoice
A PyTorch implementation of the paper "Voice Separation with an Unknown Number of Multiple Speakers," which presents a new method for separating a mixed audio sequence in which multiple voices speak simultaneously. The method employs gated neural networks trained to separate the voices over multiple processing steps while keeping the speaker in each output channel fixed. A separate model is trained for each possible number of speakers, and the model with the largest number of speakers is used to estimate the actual number of speakers in a given sample. The method greatly outperforms the prior state of the art, which, as the authors show, is not competitive for more than two speakers.
unified2021
A Unified Speech Enhancement Front-End for Online Dereverberation, Acoustic Echo Cancellation, and Source Separation
voicefixer
General Speech Restoration