Able to reproduce Meta's quality?
listener17 opened this issue
Did you try to replicate Meta's training?
Just curious whether it is at all possible to replicate the results from the code that Meta shared.
I would be very curious to hear your opinions.
Thanks!
To begin with, I am a newcomer to speech, so I may not explain this well. I have updated some demos, and you can listen to those. During training, I did not add the LM model or use the balancer, but I found the results aren't bad.
The audio was generated with our checkpoint, which was trained on LibriTTS 960h for 16 epochs.
During training, I also found that the VQ loss is probably not very important, because it did not converge...
Maybe there are some small bugs.
Thanks for the insights!