ZhikangNiu / encodec-pytorch

Unofficial implementation of High Fidelity Neural Audio Compression

Able to reproduce Meta's quality?

listener17 opened this issue

Did you try to replicate Meta's training?
Just curious: is it at all possible to replicate their results from the code that Meta shared?

I would be very curious to hear your opinions.

Thanks!

To start with, I am a newcomer to speech, so I can't explain this very well. I have uploaded some demos that you can listen to. During training, I did not add the LM model or use the balancer, but I found that the results aren't bad.

The audio was generated with our checkpoint, which was trained on LibriTTS 960h for 16 epochs.
During training, I also found that the VQ loss is probably not very important, because it didn't converge...
Maybe there are some small bugs.

Thanks for the insights!