yuguochencuc / DB-AIAT

The implementation of "Dual-branch Attention-In-Attention Transformer for single-channel speech enhancement"

too quiet parts of speech

Nistrian opened this issue · comments

Hey! One of the Enhance.py arguments, --snr, is not used in your code; at least in the enhance function, I have not found any use for it. With the default settings, my slightly noisy audio comes out clearer, but parts of the speech are much quieter and the speech volume is uneven. I think snr could help with this, right?

Thanks for your question. Actually, the "--snr" argument is not used in this code, so you can ignore it. The VB+DEMAND test set is not split by different SNRs. For our other test sets, such as WSJ0+DNS-Challenge, we use the snr argument for inference. So in this code, you can just ignore it.

Thanks for the quick response. In that case, could you advise how to correct the overly quiet parts of speech? Most of the audio is at the same volume, while some parts are attenuated too much.

Thanks for your response. Actually, I think you could use Voice Activity Detection (VAD) to tackle the over-attenuated parts: when you run inference, you could calculate the power of each frame to detect the overly quiet frames and correct them. In my research I did not implement this manual detection; in fact, I did not correct the quiet parts for this dataset. As for the "--snr" argument, it is only used to split up different test sets, and we do not calculate SNRs to help enhance the noisy utterances. Following your suggestion, we might calculate segmental SNRs to detect the correlation between speech and noise for better enhancement. Thank you.
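The frame-power idea above can be sketched as a simple post-processing step. This is not part of the DB-AIAT code; it is a minimal illustration, assuming non-overlapping frames and hypothetical dB thresholds (`active_db` for the silence floor, `target_db` for the desired speech level) that you would need to tune for your audio:

```python
import numpy as np

def boost_quiet_frames(wav, sr, frame_ms=32,
                       active_db=-35.0, target_db=-25.0, max_gain_db=12.0):
    """Boost overly quiet speech frames toward a target level.

    Frames whose RMS level falls between the assumed silence floor
    (active_db) and the target level (target_db) are treated as
    over-attenuated speech and amplified; frames below the floor are
    treated as silence and left alone. All thresholds are illustrative.
    """
    frame = int(sr * frame_ms / 1000)
    out = wav.astype(np.float64).copy()
    # Non-overlapping frames, so each sample receives gain at most once.
    for start in range(0, len(out) - frame + 1, frame):
        seg = out[start:start + frame]
        rms = np.sqrt(np.mean(seg ** 2))
        level_db = 20.0 * np.log10(rms + 1e-12)
        if active_db < level_db < target_db:
            gain_db = min(target_db - level_db, max_gain_db)
            out[start:start + frame] *= 10.0 ** (gain_db / 20.0)
    return np.clip(out, -1.0, 1.0)
```

Applying per-frame gains like this can introduce audible steps at frame boundaries; a real implementation would smooth the gain curve (e.g. overlap-add or a short gain ramp), but the frame-energy detection itself is the point of the sketch.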