Differences from the paper
LSimon95 opened this issue
Simon commented
The implementation differs from the paper in some modules because details were missing from the early version of the paper. I will modify the code as new information becomes available and train on a larger dataset. Some differences are listed below.
- In the paper, the mel encoder in MRTE is a conv stack like the prosody encoder, but my code uses attention modules instead.
- The conv stack in the prosody encoder has a different structure and lacks the max pooling layer, but I think this has no significant impact on results when training on a small, high-quality dataset (see the sketch after this list).
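For reference, here is a minimal sketch of a paper-style conv stack mel encoder with max pooling. The channel sizes, kernel sizes, and number of blocks are assumptions for illustration, not values from the MegaTTS 2 paper or this repo.

```python
import torch
import torch.nn as nn

class ConvStackMelEncoder(nn.Module):
    """Conv stack with max pooling, roughly as the paper describes.
    All hyperparameters here are assumed, not taken from the paper."""
    def __init__(self, n_mels=80, hidden=256, n_blocks=3):
        super().__init__()
        layers = []
        in_ch = n_mels
        for _ in range(n_blocks):
            layers += [
                nn.Conv1d(in_ch, hidden, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.MaxPool1d(kernel_size=2),  # the pooling layer this repo's prosody encoder omits
            ]
            in_ch = hidden
        self.net = nn.Sequential(*layers)

    def forward(self, mel):  # mel: (batch, n_mels, frames)
        return self.net(mel)

# Each block halves the frame axis, so 3 blocks downsample 8x.
enc = ConvStackMelEncoder()
out = enc(torch.randn(2, 80, 256))  # -> (2, 256, 32)
```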
Liujingxiu23 commented
@LSimon95 Do you know the value of n_heads in the MHA of the MRTE module used by ByteDance? I read the paper but did not find it stated. I see you use 2?
Simon commented
@Liujingxiu23 No, I couldn't find the exact value. I use the same number of heads as the content encoder for convenience.
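A minimal sketch of that choice, assuming the content dimension and sequence lengths shown here (the n_heads=2 is from this repo, not confirmed by the paper):

```python
import torch
import torch.nn as nn

n_heads = 2    # value used in this repo; the paper does not state it
d_model = 512  # assumed model dimension for illustration

# Cross-attention: content-encoder output attends to reference mel features.
mha = nn.MultiheadAttention(d_model, num_heads=n_heads, batch_first=True)
content = torch.randn(2, 100, d_model)  # query
mel_ref = torch.randn(2, 400, d_model)  # key/value
out, attn = mha(content, mel_ref, mel_ref)
print(out.shape)  # torch.Size([2, 100, 512])
```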