Plachtaa / FAcodec

Training code for FAcodec presented in NaturalSpeech 3


What do the loss curves look like during your successful training?

YuXiangLin1234 opened this issue

Hello,

I've attempted to train FAcodec on my own dataset. However, whether I start from scratch or fine-tune your provided checkpoint, the reconstructed audio clips are just noise. For fine-tuning I used around 128 hours of Common Voice 18 zh-TW data. After approximately 20k steps training appeared to plateau: some losses, such as the feature loss, decreased steadily, while others, such as the mel loss and waveform loss, kept oscillating.
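For reference, this is roughly how I pull the loss curves out of the TensorBoard event files to compare runs. The log directory and scalar tag names in the sketch are placeholders, not necessarily the tags FAcodec's training script actually writes:

```python
# Rough sketch: extract scalar loss curves from TensorBoard event files so the
# runs can be compared side by side. The log directory and tag names below are
# guesses; replace them with whatever your run actually logs.
import matplotlib.pyplot as plt
from tensorboard.backend.event_processing import event_accumulator

LOG_DIR = "./logs/facodec_run"  # hypothetical log directory
TAGS = ["train/mel_loss", "train/feature_loss", "train/waveform_loss"]  # assumed tag names

acc = event_accumulator.EventAccumulator(
    LOG_DIR, size_guidance={event_accumulator.SCALARS: 0})  # 0 = keep all scalar events
acc.Reload()

for tag in TAGS:
    if tag not in acc.Tags().get("scalars", []):
        print(f"tag not found: {tag}")
        continue
    events = acc.Scalars(tag)
    plt.plot([e.step for e in events], [e.value for e in events], label=tag)

plt.xlabel("step")
plt.ylabel("loss")
plt.legend()
plt.savefig("loss_curves.png")
```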

Do all losses decrease during your training process?

Could you please share your voice examples and loss curves? I believe they would help in analyzing the issue you encountered.

According to the mel_loss in the loss curve you shared, the model seems to have converged well.
However, the reconstructed audio samples sound as if they were generated by a randomly initialized model.
May I know whether the reconstructed sample is retrieved from TensorBoard or through a separate reconstruction script?
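If it is the TensorBoard one, a sketch like the following can dump the logged audio summaries to .wav files for closer listening; the log directory and audio tag here are assumptions and should be adjusted to your run:

```python
# Sketch: export the audio summaries logged to TensorBoard as .wav files so the
# TensorBoard samples can be compared with the output of a standalone
# reconstruction script. Log directory and tag name are assumptions.
import os

from tensorboard.backend.event_processing import event_accumulator

LOG_DIR = "./logs/facodec_run"  # hypothetical log directory
TAG = "eval/reconstructed"      # assumed audio summary tag
OUT_DIR = "tb_audio"

acc = event_accumulator.EventAccumulator(
    LOG_DIR, size_guidance={event_accumulator.AUDIO: 0})  # 0 = keep all audio events
acc.Reload()

os.makedirs(OUT_DIR, exist_ok=True)
for event in acc.Audio(TAG):
    # encoded_audio_string already contains the encoded audio bytes (typically WAV)
    out_path = os.path.join(OUT_DIR, f"step_{event.step}.wav")
    with open(out_path, "wb") as f:
        f.write(event.encoded_audio_string)
```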