hifi_gan-vctk_small vs hifi_gan-vctk_medium (release 2021-03-28)

Question

svenha opened this issue 3 years ago

The naming confuses me a little bit. hifi_gan-vctk_small is larger (and slower) than hifi_gan-vctk_medium.

Michael Hansen · Answer 1 · Thu Apr 15 2021 21:14:36 GMT+0800 (China Standard Time)

I wondered this as well, but the labeling from the pre-trained models in the original repo has the "medium" one as vctk_v2 and "small" as vctk_v3. Based on my understanding of the config files, v2 should be larger/slower than v3.

To make it extra confusing, the small/v3 model uses a different "resblock" but more upscale channels than medium/v2, which uses a similar configuration to the universal_large/v1 model.
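For anyone who wants to compare the two configs side by side, here is a minimal sketch that lists only the keys that differ. The key names follow the HiFi-GAN config style as I understand it, and the values below are illustrative placeholders, not copied from the actual vctk_v2/vctk_v3 files; in practice you would load the real JSON files with `json.load`.

```python
def diff_configs(a, b):
    """Return {key: (a_value, b_value)} for every key whose value differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Placeholder configs (assumed values, for illustration only).
medium_v2 = {
    "resblock": "1",
    "upsample_initial_channel": 128,
    "resblock_kernel_sizes": [3, 7, 11],
}
small_v3 = {
    "resblock": "2",
    "upsample_initial_channel": 256,
    "resblock_kernel_sizes": [3, 7, 11],
}

for key, (v2, v3) in diff_configs(medium_v2, small_v3).items():
    print(f"{key}: medium/v2={v2!r}  small/v3={v3!r}")
```

A diff like this only shows where the configs diverge; it does not say which model is faster, so it is best paired with an actual timing run.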

I may just flip the medium/small labels though if there is an obvious performance difference between the two. I've focused all my testing on the large vs. small to date.

Florian Quirin · Answer 2 · Fri Apr 23 2021 17:54:25 GMT+0800 (China Standard Time)

So I've tested medium and small with a larger number of voices and with both short and long sentences, and small was either equal or even slower (within the error bars, I guess).
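A rough sketch of how such a comparison could be timed, in case someone wants to reproduce it. This is not the script used for the test above; `fake_small` and `fake_medium` are hypothetical stand-ins for calls into the two vocoders, and "within the error bars" is approximated here as overlapping mean ± standard deviation intervals.

```python
import statistics
import time


def benchmark(fn, inputs, repeats=5):
    """Return (mean, stdev) of wall-clock seconds for running fn over all inputs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for item in inputs:
            fn(item)
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), statistics.stdev(timings)


def overlaps(a, b):
    """True if the intervals mean +/- stdev of the two results overlap."""
    (mean_a, sd_a), (mean_b, sd_b) = a, b
    return abs(mean_a - mean_b) <= sd_a + sd_b


# Hypothetical stand-ins; replace with real synthesis calls per vocoder.
def fake_small(text):
    return sum(range(len(text) * 2000))


def fake_medium(text):
    return sum(range(len(text) * 1000))


sentences = ["Short one.", "A somewhat longer sentence to synthesize for timing."]
small = benchmark(fake_small, sentences)
medium = benchmark(fake_medium, sentences)
print(f"small:  {small[0]:.4f}s +/- {small[1]:.4f}")
print(f"medium: {medium[0]:.4f}s +/- {medium[1]:.4f}")
print("within error bars" if overlaps(small, medium) else "clear difference")
```

Running each model several times over the same sentences, and over several voices, keeps a single slow run from deciding the result.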

Michael Hansen · Answer 3 · Tue Aug 24 2021 05:12:43 GMT+0800 (China Standard Time)

I ended up swapping the medium/low vocoder labels in v0.5.