fatchord / WaveRNN

WaveRNN Vocoder + TTS

Home Page: https://fatchord.github.io/model_outputs/

Attention blank: is it because of my progressive training schedule?

lilmadman007 opened this issue · comments

My attention is empty after 10k steps, which shouldn't be normal.
I'm using the LJSpeech dataset.
This is the second time I have preprocessed everything and trained.

image

Loss is around 1.0 at 10k steps.
Are my settings wrong here? Does this not work?

image

Thanks!

NOTE: I LOOKED AT THIS ISSUE ALREADY -> #154

Hi, sometimes the alignment will fail randomly. I've never tried with a batch size of 8, so that could be it. Maybe try fine-tuning one of the pretrained models.

did you ever solve this?

Sorry for the lack of feedback. No, I did not. When fatchord commented that it sometimes fails, I tried 2 more times, but it just didn't work. Maybe my GPU is just not good enough, like I said, but I moved on when I couldn't get results.
Any help would be appreciated anyway!

I think I found a solution:

  1. Increase "r" from 7 to 12 in the tts_schedule in hparams.py.
  2. Go to models/tacotron.py and change line 200 from "scores = torch.sigmoid(u) / torch.sigmoid(u).sum(dim=1, keepdim=True)" to "scores = F.softmax(u, dim=1)".

I got this from #154 (comment).
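
To see how the two normalizations in step 2 differ, here is a small torch-free sketch in plain Python. The score values are made up for illustration and are not taken from the repo or from any training run. It only compares the two expressions numerically and is not a claim about why alignment fails in a particular setup.

```python
import math

def sigmoid_normalized(u):
    """Sigmoid of each score divided by the sum of sigmoids (the original expression)."""
    s = [1.0 / (1.0 + math.exp(-x)) for x in u]
    total = sum(s)
    return [v / total for v in s]

def softmax(u):
    """Numerically stable softmax (the suggested replacement)."""
    m = max(u)
    e = [math.exp(x - m) for x in u]
    total = sum(e)
    return [v / total for v in e]

# Hypothetical attention energies for one decoder step over 5 encoder positions.
u = [4.0, 6.0, 9.0, 5.0, 3.0]

sig = sigmoid_normalized(u)
soft = softmax(u)

print("sigmoid-normalized:", [round(v, 3) for v in sig])
print("softmax:           ", [round(v, 3) for v in soft])
```

With these made-up scores, the sigmoid version stays close to uniform because every sigmoid saturates near 1, while softmax puts most of the weight on the largest score. Both outputs still sum to 1.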