fatchord / WaveRNN

WaveRNN Vocoder + TTS

Home Page: https://fatchord.github.io/model_outputs/

anyone tried num_mels=180 (or any number >120)?

shuzeZHAO opened this issue · comments

Hi all, has anyone tried training WaveRNN with num_mels=180? With num_mels=120 it works well. When I increase it to 180, Tacotron still works well, but the WaveRNN model only learns to generate white noise mixed with some human-like muttering (no meaningful words). Here are the key hparams; the others are pretty much the defaults. Please let me know if you have tried this or have any tips on how to adjust the model. Thanks!

sample_rate = 24000
n_fft = 2048
num_mels = 180
hop_length = 250                  
win_length = 1000              
voc_mode = 'RAW'              
voc_upsample_factors = (5, 5, 10)
mu_law = True 
peak_norm = True
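
A quick consistency check on the values above may help rule out the simplest causes. This is only a minimal sketch: `check_hparams` and `hop_ms` are hypothetical helpers, not part of the WaveRNN repo. It assumes that the product of `voc_upsample_factors` has to equal `hop_length` (the vocoder upsamples each mel frame by that factor), that `win_length` should not exceed `n_fft`, and that `num_mels` cannot exceed the number of FFT bins (`n_fft // 2 + 1`).

```python
from math import prod  # Python 3.8+

# Values copied from the hparams quoted above.
hparams = {
    "sample_rate": 24000,
    "n_fft": 2048,
    "num_mels": 180,
    "hop_length": 250,
    "win_length": 1000,
    "voc_upsample_factors": (5, 5, 10),
}


def check_hparams(hp):
    """Hypothetical helper: return a list of consistency problems (empty if none)."""
    problems = []
    # Assumption: the upsample factors must multiply to exactly hop_length.
    if prod(hp["voc_upsample_factors"]) != hp["hop_length"]:
        problems.append("voc_upsample_factors do not factorise hop_length")
    # The analysis window has to fit inside the FFT frame.
    if hp["win_length"] > hp["n_fft"]:
        problems.append("win_length is larger than n_fft")
    # There cannot be more mel bands than linear frequency bins.
    if hp["num_mels"] > hp["n_fft"] // 2 + 1:
        problems.append("num_mels exceeds the number of FFT bins")
    return problems


def hop_ms(hp):
    """Hypothetical helper: duration of one mel frame in milliseconds."""
    return 1000 * hp["hop_length"] / hp["sample_rate"]


problems = check_hparams(hparams)
print(problems if problems else "no basic inconsistencies found")
print(f"frame shift: {hop_ms(hparams):.2f} ms")
```

The quoted values pass all three checks (5 × 5 × 10 = 250), so these basic constraints alone would not explain the noisy output at num_mels=180.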

Hi @shuzeZHAO,

Have you noticed any improvement in audio quality with 120 mels versus 80?

Hi @alexdemartos,
It improved for male (low-pitched) voices; for high-pitched voices it is about the same, I think.