poor alignment when synthesizing long sentences

Question

moonnee opened this issue 6 years ago

Thank you for your work! It helps a lot.
I want to ask whether your alignment is good when synthesizing sentences longer than 10 words, for example around 20 words. The paper said 'the model fails when conditioned on the shorter source phrases, successfully aligns when conditioned on the longest input.' The reference audio clips I used are about 20 words long, but the model only works well when synthesizing shorter sentences. Please find some samples attached. By the way, I used the Nancy and Blizzard 2017 datasets for training.
Could you give me some suggestions? Thank you.
samples.zip

Shan Yang · Answer 1 · Thu Sep 27 2018 11:24:24 GMT+0800 (China Standard Time)

Hi, for long sentences you can try GMM attention. It works especially well on long inputs.
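
For readers who have not used it: GMM attention, in the style of Graves (2013), computes the alignment from a mixture of Gaussian-like windows over encoder positions, and the window centers can only move forward at each decoder step. That built-in monotonicity is a likely reason it holds up on long inputs. Below is a minimal NumPy sketch of a single decoder step. It is not the code from this repository; the function name `gmm_attention_step` and the `(3, K)` parameter layout are assumptions made for illustration, and in a real model the raw parameters would come from a learned projection of the decoder state.

```python
import numpy as np

def gmm_attention_step(raw_params, prev_kappa, memory_len):
    """One decoder step of GMM attention (illustrative sketch).

    raw_params: array of shape (3, K) with unconstrained values for the
                mixture weights, widths, and center increments
                (hypothetical layout; normally predicted by the decoder).
    prev_kappa: array of shape (K,), window centers from the previous step.
    memory_len: number of encoder positions T.
    Returns (alignment of shape (T,), updated centers of shape (K,)).
    """
    w_hat, beta_hat, kappa_hat = raw_params
    w = np.exp(w_hat)                        # positive mixture weights
    beta = np.exp(beta_hat)                  # positive inverse widths
    kappa = prev_kappa + np.exp(kappa_hat)   # centers only move forward

    positions = np.arange(memory_len)[None, :]            # shape (1, T)
    # Each component contributes a window centered at kappa_k: shape (K, T).
    phi = w[:, None] * np.exp(-beta[:, None] * (kappa[:, None] - positions) ** 2)
    alignment = phi.sum(axis=0)              # sum over components: shape (T,)
    return alignment, kappa

# Toy usage: random raw parameters stand in for the decoder's predictions.
rng = np.random.default_rng(0)
kappa = np.zeros(4)
for step in range(3):
    alignment, kappa = gmm_attention_step(rng.normal(size=(3, 4)), kappa, 30)
    print(step, np.round(kappa, 2), int(alignment.argmax()))
```

Note that the alignment here is not normalized to sum to 1, as in the original formulation; some implementations normalize the weights or the final alignment, which is a design choice outside this sketch.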