Has anyone tried num_mels=180 (or any number >120)?

Question

shuzeZHAO opened this issue 5 years ago · comments

Hi all, has anyone tried training WaveRNN with num_mels=180? With num_mels=120 it works well. When I increase it to 180, Tacotron still works well, but the WaveRNN model only learns to generate white noise mixed with some human-like muttering (no meaningful words). The key hparams are listed below; the others are pretty much the defaults. Please let me know if you have tried this or have any tips on how to adjust the model. Thanks!

sample_rate = 24000
n_fft = 2048
num_mels = 180
hop_length = 250
win_length = 1000
voc_mode = 'RAW'
voc_upsample_factors = (5, 5, 10)
mu_law = True
peak_norm = True
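
A quick sanity check on these values can be sketched as follows. This is only a rough sketch, not code from the repo. It checks that the upsample factors multiply out to hop_length and estimates how narrow the lowest mel band gets compared with the FFT bin spacing. The helper names, the HTK-style mel formula, and the fmin=0 / fmax=sample_rate/2 range are all assumptions made for illustration. The actual filterbank used in training may differ (librosa, for example, defaults to the Slaney scale), so the numbers are indicative at best.

```python
import math

# Values copied from the hparams above.
sample_rate = 24000
n_fft = 2048
hop_length = 250
win_length = 1000
voc_upsample_factors = (5, 5, 10)


def upsampling_matches_hop(factors, hop):
    """True if the vocoder upsampling factors multiply out to the hop length."""
    return math.prod(factors) == hop


def hz_to_mel(f):
    # HTK-style mel formula (an assumption; other mel scales give other numbers).
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    # Inverse of hz_to_mel above.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def narrowest_band_hz(num_mels, fmin=0.0, fmax=None):
    """Spacing in Hz between the two lowest mel band edges.

    fmin and fmax are hypothetical defaults, not values from the post.
    """
    fmax = sample_rate / 2 if fmax is None else fmax
    # num_mels triangular filters need num_mels + 2 edges, i.e. num_mels + 1 steps.
    step = (hz_to_mel(fmax) - hz_to_mel(fmin)) / (num_mels + 1)
    return mel_to_hz(hz_to_mel(fmin) + step) - fmin


if __name__ == "__main__":
    print("upsample factors match hop_length:",
          upsampling_matches_hop(voc_upsample_factors, hop_length))
    print("win_length <= n_fft:", win_length <= n_fft)
    print("FFT bin spacing (Hz):", sample_rate / n_fft)
    for n in (80, 120, 180):
        print(n, "mels -> narrowest band (Hz):", round(narrowest_band_hz(n), 2))
```

If the narrowest band comes out close to or below the FFT bin spacing, the lowest mel channels may carry little or no information. Whether that has anything to do with the WaveRNN behaviour described here is not established; it is just one thing that is cheap to rule out.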

Alejandro Pérez González de Martos · Answer 1 · Tue Oct 15 2019 19:21:44 GMT+0800 (China Standard Time)

Hi @shuzeZHAO,

have you noticed improvements in audio quality by using 120 vs 80 mels?

S.Zhao · Answer 2 · Tue Oct 15 2019 21:23:46 GMT+0800 (China Standard Time)

Hi @alexdemartos,
It's improved for male (low-pitched) voices, and high-pitched voices sound about the same, I think.