Robotic noise on audio

Question

Likkkez opened this issue a year ago

I'm trying to fine-tune the 24 kHz model, but it seems that the more I train it, the worse it becomes. Here's how the audio sounds now:

out.mp4

I have a custom dataset with a few speakers. Any idea how to get better results? Would adding in the original VCTK dataset to the mix make it better?

Jing-Yi Li · Answer 1 · Mon Feb 13 2023 23:21:06 GMT+0800 (China Standard Time)

Maybe the custom dataset is too small, which leads to overfitting. Mixing in the original VCTK data may help.
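
As a rough illustration of mixing the two datasets, here is a minimal sketch that merges a custom file list with a sample of VCTK entries and shuffles the result. It assumes training reads from a plain list of entries (one utterance per item); the function name `mix_filelists`, the `vctk_ratio` knob, and the example paths are hypothetical and not taken from the project.

```python
import random


def mix_filelists(custom, vctk, vctk_ratio=1.0, seed=0):
    """Combine custom entries with a random sample of VCTK entries.

    vctk_ratio is a hypothetical knob: how many VCTK entries to add
    per custom entry (capped at the number of VCTK entries available).
    """
    rng = random.Random(seed)  # fixed seed keeps the mix reproducible
    n_vctk = min(len(vctk), int(len(custom) * vctk_ratio))
    mixed = list(custom) + rng.sample(list(vctk), n_vctk)
    rng.shuffle(mixed)  # interleave speakers so batches are not dataset-ordered
    return mixed


# Example with made-up paths: 4 custom utterances plus 8 sampled from VCTK.
custom = [f"custom/spk1_{i:03d}.wav" for i in range(4)]
vctk = [f"vctk/p225_{i:03d}.wav" for i in range(100)]
mixed = mix_filelists(custom, vctk, vctk_ratio=2.0)
```

Keeping all custom entries and only sampling VCTK lets you control how strongly the original data regularizes the fine-tune; the right ratio is something to tune by ear.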

Likkkez · Answer 2 · Mon Feb 13 2023 23:36:39 GMT+0800 (China Standard Time)

Should I restart fine-tuning after I add VCTK, or just continue from this checkpoint?

Jing-Yi Li · Answer 3 · Tue Feb 14 2023 23:41:05 GMT+0800 (China Standard Time)

I would prefer restarting after adding VCTK.

Likkkez · Answer 4 · Wed Feb 15 2023 20:38:10 GMT+0800 (China Standard Time)

I added VCTK and trained further from an earlier checkpoint. I think it cleared up pretty nicely! Thanks for all the help!

out.mp4