yuguochencuc / DB-AIAT

The implementation of "Dual-branch Attention-In-Attention Transformer for single-channel speech enhancement"

too quiet parts of speech

Nistrian opened this issue · comments

Hey! One of the Enhance.py arguments, --snr, is not used in your code; at least in the enhance function, I have not found any use for it. With the default settings, my slightly noisy audio comes out clearer, but parts of the speech are much quieter and the speech volume is uneven. I think snr could help with this, right?

Thanks for your question. Actually, the "--snr" argument is not used in this code, so you can ignore it. The VB+DEMAND test set is not split by different SNRs. For our other test sets, such as WSJ0+DNS-Challenge, we use the snr argument for inference. So in this code, you can just ignore it.

Thanks for the quick response. In that case, could you advise how to correct the overly quiet parts of speech? Most of the audio is at the same volume, while some parts are attenuated too much.

Thanks for your response. Actually, I think you could use Voice Activity Detection (VAD) to tackle the over-attenuated parts: when you run inference, you could calculate the power of each frame to detect the overly quiet frames and correct them. In my research I did not implement this manual detection; in fact, I did not correct the quiet parts for this dataset. As for the "--snr" argument, it is only used to split up different test sets, and we do not calculate SNRs to help enhance the noisy utterances. Following your suggestion, we might calculate segmental SNRs to detect the correlation between speech and noise for better enhancement. Thank you.
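The frame-power idea above can be sketched as a simple post-processing step. This is not part of the DB-AIAT code; it is a minimal illustration, assuming non-overlapping frames and hypothetical dB thresholds (`active_db` for the silence floor, `target_db` for the desired speech level) that you would need to tune for your audio:

```python
import numpy as np

def boost_quiet_frames(wav, sr, frame_ms=32,
                       active_db=-35.0, target_db=-25.0, max_gain_db=12.0):
    """Boost overly quiet speech frames toward a target level.

    Frames whose RMS level falls between the assumed silence floor
    (active_db) and the target level (target_db) are treated as
    over-attenuated speech and amplified; frames below the floor are
    treated as silence and left alone. All thresholds are illustrative.
    """
    frame = int(sr * frame_ms / 1000)
    out = wav.astype(np.float64).copy()
    # Non-overlapping frames, so each sample receives gain at most once.
    for start in range(0, len(out) - frame + 1, frame):
        seg = out[start:start + frame]
        rms = np.sqrt(np.mean(seg ** 2))
        level_db = 20.0 * np.log10(rms + 1e-12)
        if active_db < level_db < target_db:
            gain_db = min(target_db - level_db, max_gain_db)
            out[start:start + frame] *= 10.0 ** (gain_db / 20.0)
    return np.clip(out, -1.0, 1.0)
```

Applying per-frame gains like this can introduce audible steps at frame boundaries; a real implementation would smooth the gain curve (e.g. overlap-add or a short gain ramp), but the frame-energy detection itself is the point of the sketch.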