Problem about using denoiser as preprocess

Question

lfgogogo opened this issue 2 years ago · comments

Hi, thanks for the great work, the denoiser really does denoise.
But when I use it as a preprocessing step for ASR, the results are worse. I use the denoiser as below:

from denoiser.demucs import Demucs
import torch
import time
import torchaudio
from scipy.io.wavfile import write

ROOT = "https://dl.fbaipublicfiles.com/adiyoss/denoiser/"
DNS_48_URL = ROOT + "dns48-11decc9d8e3f0998.th"
DNS_64_URL = ROOT + "dns64-a7761ff99a7d5bb6.th"
MASTER_64_URL = ROOT + "master64-8a5dfb4bb92753dd.th"

def _demucs(pretrained, url, **kwargs):
    model = Demucs(**kwargs, sample_rate=16_000)
    if pretrained:
        state_dict = torch.hub.load_state_dict_from_url(url, map_location='cpu')
        model.load_state_dict(state_dict)
    return model

def dns48(pretrained=True):
    return _demucs(pretrained, DNS_48_URL, hidden=48)

def dns64(pretrained=True):
    return _demucs(pretrained, DNS_64_URL, hidden=64)

def master64(pretrained=True):
    return _demucs(pretrained, MASTER_64_URL, hidden=64)

if __name__ == '__main__':
    model = master64().cuda().eval()
    if model is None:
        print('model is none')
    x, _ = torchaudio.load(r'noise.wav')
    out = model(x.cuda())
    # ASR
    write(r'denoise.wav', 16000, out[0][0].cpu().detach().numpy())

Am I missing an important step? Or has anyone had a similar problem? Hoping for an answer.

Alexandre Défossez · Answer 1 · Mon Feb 28 2022 17:32:52 GMT+0800 (China Standard Time)

Can you make sure that your wav files are indeed sampled at 16000 Hz? That could be one issue.
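One way to run this check, as a minimal sketch: read the sample rate from the WAV header with Python's standard `wave` module. The helper names `wav_sample_rate` and `is_16k` are hypothetical, and the sketch assumes PCM-encoded WAV files, since `wave` rejects other encodings.

```python
import wave


def wav_sample_rate(path):
    """Return the sample rate in Hz stored in a PCM WAV file's header."""
    # wave.open raises wave.Error for non-PCM files (e.g. float WAVs).
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def is_16k(path, expected=16_000):
    """Check that the file matches the rate the pretrained models assume."""
    return wav_sample_rate(path) == expected
```

In the script from the question, the second value returned by `torchaudio.load` is the sample rate, so comparing it against 16000 instead of discarding it gives the same check. If the rates differ, `torchaudio.functional.resample(x, sr, 16000)` is one option for converting before calling the model.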

lfgogogo · Answer 2 · Mon Feb 28 2022 17:42:20 GMT+0800 (China Standard Time)

Of course, they are 16000 Hz.

Alexandre Défossez · Answer 3 · Wed Mar 23 2022 22:39:31 GMT+0800 (China Standard Time)

If the noise level is not super high, it is possible that the artefacts from the separation are actually hurting the ASR performance. For noisy samples, though, I would expect it to work. It also depends on whether your ASR model was trained on noisy data or not.
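If artefacts are the cause, one common mitigation (not suggested in this thread, so treat it as an assumption) is to blend some of the original signal back into the denoised output before running ASR. The sketch below shows such a dry/wet mix. The function name `dry_wet_mix` and the default `dry=0.1` are hypothetical, and whether this helps a given ASR model would need to be measured.

```python
def dry_wet_mix(noisy, enhanced, dry=0.1):
    """Blend the original (dry) signal with the denoised (wet) one.

    dry=0.0 returns only the denoised signal, dry=1.0 only the input.
    Works on anything supporting arithmetic with floats, such as
    torch tensors or NumPy arrays of the same shape.
    """
    if not 0.0 <= dry <= 1.0:
        raise ValueError("dry must be in [0, 1]")
    return dry * noisy + (1.0 - dry) * enhanced
```

In the script from the question this would be applied as `dry_wet_mix(x.cuda(), out[0], dry=0.1)` before writing the file. Sweeping `dry` over a few values and comparing word error rates is a simple way to see whether the artefacts or the remaining noise hurts recognition more.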