LSimon95 / megatts2

Unofficial implementation of MegaTTS 2

Differences from the paper

LSimon95 opened this issue · comments

The implementation differs from the paper in some modules because details were missing from the early version of the paper. I will modify the code as new information becomes available and train on a larger dataset. The main differences are listed below.

  • In the paper, the mel-encoder in MRTE is a conv stack like the prosody-encoder, but my code uses attention modules instead.
  • The conv stack in the prosody-encoder has a different structure and lacks the max-pooling layer, but I think this has no significant impact when training on a small, high-quality dataset.
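The second difference can be sketched as a toggleable conv stack. This is a hypothetical illustration, not the repo's actual code: the layer count, channel width, and kernel sizes are assumptions, and `use_max_pool=True` approximates the paper's downsampling variant while `use_max_pool=False` mimics the omission described above.

```python
import torch
import torch.nn as nn

def prosody_conv_stack(channels: int = 128, use_max_pool: bool = True) -> nn.Sequential:
    """Hypothetical prosody-encoder conv stack (sizes are illustrative).

    The paper's version includes max pooling to downsample along the time
    axis; passing use_max_pool=False mimics this implementation's omission.
    """
    layers = []
    for _ in range(2):
        layers += [
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
        ]
        if use_max_pool:
            layers.append(nn.MaxPool1d(kernel_size=2))  # halves time resolution
    return nn.Sequential(*layers)

x = torch.randn(1, 128, 64)  # (batch, channels, time)
print(prosody_conv_stack(use_max_pool=True)(x).shape)   # time 64 -> 16
print(prosody_conv_stack(use_max_pool=False)(x).shape)  # time unchanged: 64
```

The only structural effect of the pooling layers is the time-axis downsampling; the channel dimension is untouched in both variants.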

@LSimon95 Do you know the value of n_heads in the MHA of the MRTE module used by ByteDance? I read the paper but did not find it stated. I see you use 2?

@Liujingxiu23 No, I couldn't find the exact value either. I use the same value as the content encoder for convenience.
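The attention being discussed can be sketched as a cross-attention block where content-encoder features attend over mel-encoder features, with n_heads=2 as mentioned above. The class name, feature dimension, and call pattern here are assumptions for illustration and may not match the repo's actual modules.

```python
import torch
import torch.nn as nn

class MRTECrossAttention(nn.Module):
    """Hypothetical sketch of the MHA inside MRTE.

    Content features (queries) attend to mel features (keys/values).
    n_heads=2 mirrors the value discussed in this thread; d_model is an
    assumed placeholder.
    """
    def __init__(self, d_model: int = 256, n_heads: int = 2):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, content: torch.Tensor, mel: torch.Tensor) -> torch.Tensor:
        # content: (B, T_text, d_model), mel: (B, T_mel, d_model)
        out, _ = self.attn(query=content, key=mel, value=mel)
        return out  # (B, T_text, d_model): timbre/prosody info aligned to text

content = torch.randn(2, 10, 256)  # text-side features
mel = torch.randn(2, 50, 256)      # reference-mel features
y = MRTECrossAttention()(content, mel)
print(y.shape)  # torch.Size([2, 10, 256])
```

Note that the output keeps the query (text) length, so whatever n_heads is chosen, it only changes how the d_model dimension is split across heads (d_model must be divisible by n_heads).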