sony / bigvsan

PyTorch implementation of BigVSAN

About the SAN losses

sh-lee-prml opened this issue

Thanks for the nice work 👍

I'm now training my TTS model with BigVGAN replaced by BigVSAN.

First, the model with BigVSAN shows slightly better results in the early steps! 🚀🚀🚀

I have attached my TensorBoard graphs, and I was wondering whether you saw similar results during training.

image

Have you tried tuning the hyperparameters for the generator loss or the feature matching loss? The scales of these losses are quite different from those of the baseline model (BigVGAN).

image

Thanks again 😊

Thank you for your interest and for sharing your learning curves!

That's a good question. There should be room for improvement in the hyperparameter settings. We didn't do any hyperparameter tuning; we simply compared BigVSAN and BigVGAN using the same hyperparameter values for both. We're certainly interested in how much the performance would improve with careful hyperparameter tuning, but we're spending our time on other things right now.
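For anyone who wants to try such tuning, here is a minimal sketch of one possible starting point: rescaling a loss weight so that a term whose scale has changed contributes roughly as much as it did in the baseline. This is only a heuristic, not something taken from the BigVSAN or BigVGAN code. The function names, the weight parameters (`w_adv`, `w_fm`, `w_mel`), and the numbers in the usage lines are all hypothetical.

```python
def total_generator_loss(adv_loss, fm_loss, mel_loss,
                         w_adv=1.0, w_fm=1.0, w_mel=1.0):
    """Weighted sum of generator loss terms.

    Hypothetical helper: uses only * and +, so it accepts Python floats
    as well as scalar tensors. The default weights are placeholders.
    """
    return w_adv * adv_loss + w_fm * fm_loss + w_mel * mel_loss


def rescaled_weight(baseline_scale, new_scale, baseline_weight=1.0):
    """Weight that keeps a term's weighted magnitude equal to the baseline's.

    baseline_scale: typical value of the loss term in the baseline run
    new_scale:      typical value of the same term in the new run
    """
    if new_scale <= 0:
        raise ValueError("new_scale must be positive")
    return baseline_weight * baseline_scale / new_scale


# Made-up magnitudes: the feature matching loss is twice as large in the
# new run, so its weight is halved to keep its contribution unchanged.
w_fm = rescaled_weight(baseline_scale=4.0, new_scale=8.0)
loss = total_generator_loss(adv_loss=1.0, fm_loss=8.0, mel_loss=0.5, w_fm=w_fm)
```

Matching scales is only a first guess. Whether it actually helps would need to be checked against audio quality metrics, not just the loss curves.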

Could you share your learning curves?

I hope to get a handle on training SAN 👀

Here they are!
image
image

We didn't record a separate loss for each discriminator as you do; we only have the total losses over the multiple discriminators. The light blue curves are for our BigVSAN, and the pink ones are for our BigVGAN reproduction. I hope these are informative.
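To make curves like these comparable across setups, one option is to log both views at once: one scalar per discriminator plus their total. The sketch below is a hypothetical helper, not code from either repository. The tag prefix and the discriminator names `mpd` and `mrd` are illustrative assumptions.

```python
def discriminator_loss_scalars(losses_by_name, prefix="disc_loss"):
    """Build a dict of scalars to log: one per discriminator plus the total.

    losses_by_name maps a discriminator name to its loss value. float() is
    applied, so plain numbers and single-element tensors both work.
    """
    scalars = {f"{prefix}/{name}": float(value)
               for name, value in losses_by_name.items()}
    # Compute the total before adding it to the dict, so it is not
    # counted twice.
    total = sum(scalars.values())
    scalars[f"{prefix}/total"] = total
    return scalars


# Illustrative values only.
scalars = discriminator_loss_scalars({"mpd": 1.5, "mrd": 2.5})
```

Each entry can then be passed to a logger, for example `SummaryWriter.add_scalar(tag, value, step)` from `torch.utils.tensorboard`.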