facebookresearch / denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)

We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain. In it, we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.


Denoise an audio array (ndarray) instead of an audio file (mp3)

d4rkc0de opened this issue

I have an audio array of type ndarray that I want to denoise. The example notebooks load an mp3 file instead:

wav, sr = torchaudio.load('alex_noisy.mp3')
Can I do the same with an audio array?

I tried:

audio_array = generate_audio(text_prompt)
wav = torch.from_numpy(audio_array)
wav = convert_audio(wav.cuda(), SAMPLE_RATE, model.sample_rate, model.chin)

but I got this error:
AttributeError: 'Tensor' object has no attribute 'channels'
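
For comparison with the torchaudio.load path: by default it returns a float32 tensor shaped [channels, samples], while torch.from_numpy keeps whatever dtype and shape the ndarray already has. The sketch below brings an ndarray into that same layout before conversion. The helper names prepare_waveform and to_tensor are hypothetical, and it assumes that a 1-D array is mono and that a 2-D array is already ordered (channels, samples). It may or may not be related to the AttributeError above, which it does not attempt to diagnose.

```python
import numpy as np


def prepare_waveform(audio_array):
    """Return a float32 array shaped (channels, samples).

    Hypothetical helper, not part of denoiser or torchaudio.
    Assumes 1-D input is mono and 2-D input is (channels, samples).
    """
    arr = np.asarray(audio_array)
    if np.issubdtype(arr.dtype, np.signedinteger):
        # Signed integer PCM: scale into [-1.0, 1.0) using the dtype's range.
        scale = float(np.iinfo(arr.dtype).max + 1)
        arr = arr.astype(np.float32) / scale
    else:
        # Float input (e.g. float64) is only cast, not rescaled.
        arr = arr.astype(np.float32)
    if arr.ndim == 1:
        arr = arr[np.newaxis, :]  # (samples,) -> (1, samples)
    elif arr.ndim != 2:
        raise ValueError(f"expected 1-D or 2-D audio, got {arr.ndim}-D")
    return np.ascontiguousarray(arr)


def to_tensor(arr):
    """Wrap the prepared array as a torch tensor without copying."""
    import torch  # imported lazily so the NumPy part runs without PyTorch

    return torch.from_numpy(arr)
```

With this, `wav = to_tensor(prepare_waveform(audio_array))` gives a tensor in the same layout as the first value returned by torchaudio.load. The sample rate still has to come from wherever the array was generated (SAMPLE_RATE in the snippet above), since an ndarray carries none.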