facebookresearch / denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)

We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.

Is there any non-stationary noise in interspeech2020/master?

st3inum opened this issue

I was trying to reproduce the results on the DNS dataset from interspeech2020/master (the dataset was generated with the default configuration). However, my reproduction performs really badly on non-stationary noise, although the authors' code and weights work correctly. I am not sure whether interspeech2020/master contains any non-stationary noise. Or am I doing something wrong in the configuration file? Should I try something else to get better results with non-stationary data?

There are many noise types in DNS, taken from AudioSet. I'm not sure exactly what you mean by stationary. Within a single example the noise type is always the same, but that noise type can be fairly complex. Did you listen to the samples generated by the DNS script? That will tell you whether the noise types match what you expect.

Yeah, I listened to some of the samples generated by the DNS script. I didn't find the noise I expected, although I did find some good-quality stationary noise.

By non-stationary noise, I mean sounds like a hammer hitting.

Hi @st3inum,
Yes, there are plenty of non-stationary noises in DNS. You have hammering noise as you shared, babies crying, dogs barking, etc.
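
If listening alone is inconclusive, one rough way to check whether a clip is non-stationary is to look at how much its short-time energy fluctuates. The sketch below is not part of this repository or of the DNS scripts. The function names, the 400-sample frame and 200-sample hop (25 ms and 12.5 ms, assuming 16 kHz audio), and the synthetic signals are all illustrative assumptions. It only captures energy fluctuation, so noise whose spectrum changes at constant loudness would not be detected.

```python
import math
import random


def frame_energies_db(samples, frame_len=400, hop=200, eps=1e-10):
    """Return the short-time energy (in dB) of each frame of a mono signal."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(power + eps))
    return energies


def energy_spread_db(samples, frame_len=400, hop=200):
    """Standard deviation of the frame energies in dB.

    A crude non-stationarity score (an assumption of this sketch, not a
    standard metric): steady noise gives a small value, impulsive noise
    such as hammering gives a large one.
    """
    energies = frame_energies_db(samples, frame_len, hop)
    mean = sum(energies) / len(energies)
    return math.sqrt(sum((e - mean) ** 2 for e in energies) / len(energies))


# Synthetic stand-ins for real clips (hypothetical data, 1 second at 16 kHz).
rng = random.Random(0)
sr = 16000

# Steady white noise: roughly the same energy in every frame.
steady = [rng.gauss(0.0, 0.1) for _ in range(sr)]

# Quiet background plus a short decaying burst every 0.25 s, loosely
# imitating hammer hits.
impulsive = [rng.gauss(0.0, 0.01) for _ in range(sr)]
for hit in range(0, sr, 4000):
    for i in range(hit, min(hit + 320, sr)):
        impulsive[i] += rng.gauss(0.0, 0.5) * math.exp(-(i - hit) / 80.0)

steady_score = energy_spread_db(steady)
impulsive_score = energy_spread_db(impulsive)
print(f"steady: {steady_score:.2f} dB, impulsive: {impulsive_score:.2f} dB")
```

To use this on real data, load the noise track of a generated sample as a list of floats with whatever audio reader you prefer and pass it to `energy_spread_db`. Comparing the scores across a few dozen generated clips should give a rough idea of how many contain impulsive noise.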