ming024 / FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

About fine-tuning issues.

ltydd opened this issue

I want to fine-tune the pretrained AISHELL3 model on my own dataset, but my dataset has only 6 speakers, while AISHELL3 has 218. When I load the model, I get a size mismatch error. Can anyone help me solve this problem?

I solved this by deleting the speaker embedding weight from the checkpoint when loading the model:
del ckpt['model']['speaker_emb.weight'] in utils/model.py, line 20
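
A more general form of this workaround is to drop every checkpoint entry whose shape differs from the newly built model, rather than hard-coding a single key. The following is only a minimal sketch: the helper name `drop_mismatched_entries` is hypothetical and not part of the FastSpeech2 repository, and it relies only on each value having a `.shape` attribute, as PyTorch tensors do.

```python
from typing import Any, Dict, List, Tuple


def drop_mismatched_entries(
    ckpt_state: Dict[str, Any], model_state: Dict[str, Any]
) -> Tuple[Dict[str, Any], List[str]]:
    """Keep only checkpoint entries whose shape matches the target model.

    Hypothetical helper (not part of the FastSpeech2 repository). Both
    arguments are state-dict-like mappings from parameter name to an object
    with a `.shape` attribute, such as a torch.Tensor.
    """
    kept: Dict[str, Any] = {}
    dropped: List[str] = []
    for name, value in ckpt_state.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(value.shape):
            kept[name] = value
        else:
            # Either the key is absent from the model or the shapes differ,
            # e.g. speaker_emb.weight with 218 rows vs. 6 rows.
            dropped.append(name)
    return kept, dropped
```

With PyTorch this could be called as `kept, dropped = drop_mismatched_entries(ckpt["model"], model.state_dict())` followed by `model.load_state_dict(kept, strict=False)`. Any dropped parameter keeps its fresh initialization and has to be learned during fine-tuning.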

"I solved this problem by deleting speaker_embedding.weight when loading the model: del ckpt['model']['speaker_emb.weight'] in utils/model.py, line 20."

I have already deleted "speaker_emb.weight", but the error still occurs: “RuntimeError: The size of tensor a (218) must match the size of tensor b (6) at non-singleton dimension 0”.

I don't remember whether I ran into this error. Have you tried model.load_state_dict(ckpt["model"], strict=False)?
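
For context on that suggestion: in PyTorch, strict=False makes load_state_dict tolerate missing or unexpected keys, but a key that is present with a different shape still raises an error, so the mismatched entry has to be removed first. The sketch below illustrates this with toy nn.Embedding layers standing in for the speaker embedding; it is not the actual FastSpeech2 model, and the embedding width of 4 is an arbitrary assumption.

```python
import torch
from torch import nn

# Toy stand-ins: a "pretrained" layer with 218 speakers and a new one with 6.
# The embedding width of 4 is arbitrary and chosen only for this example.
pretrained = nn.Embedding(218, 4)
finetune = nn.Embedding(6, 4)
state = pretrained.state_dict()

# strict=False tolerates missing or unexpected keys, but a key that is present
# with the wrong shape still raises RuntimeError.
try:
    finetune.load_state_dict(state, strict=False)
    shape_mismatch_raised = False
except RuntimeError:
    shape_mismatch_raised = True

# Removing the mismatched key first lets the load succeed; the key is then
# reported as missing and the layer keeps its fresh initialization.
del state["weight"]
result = finetune.load_state_dict(state, strict=False)
missing = list(result.missing_keys)
```

After this runs, `missing` lists the parameters that were not restored from the checkpoint, which is a quick way to confirm that only the speaker embedding was skipped.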

Hi. I also encountered this issue. Have you already solved it?