r9y9 / nnmnkwii

Library to build speech synthesis systems designed for easy and fast prototyping.

Home Page: https://r9y9.github.io/nnmnkwii/latest/

Bug in parameter generation

hyama5 opened this issue

Hello, I found a bug in paramgen.mlpg. Specifically, the generated parameters at the beginning and end of an utterance become small even when the static mean is large. The following Google Colab notebook shows an example of this strange MLPG behavior.
https://colab.research.google.com/drive/1C5TzPjaDRwDKuOV_XmeCmMAnX_QxEH3P

This might be caused by using the distributions of the dynamic features of the first (t=0) and final (t=T-1) frames in MLPG, even though these distributions cannot be defined without the values of frames t=-1 and t=T.
Merlin works around this problem by assigning a very large value (100000000000) to the variance of the first and final frames.
https://github.com/CSTR-Edinburgh/merlin/blob/master/src/frontend/mlpg_fast.py
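
To make the boundary effect concrete, here is a minimal NumPy sketch for a one-dimensional feature. It is not the code of nnmnkwii's paramgen.mlpg or of Merlin: the function names `window_matrix` and `mlpg_1d`, the `relax_edges` flag, and the toy means and variances are all hypothetical and chosen for illustration. It assumes the first window is the static one, treats frames outside the utterance as zero, and solves the usual weighted least-squares system with dense matrices, which is only practical for short sequences.

```python
import numpy as np


def window_matrix(window, T):
    """Dense (T, T) matrix that applies `window` to a length-T sequence.

    Frames outside [0, T) are treated as zero, which is why the dynamic
    features at t=0 and t=T-1 are not well defined.
    """
    W = np.zeros((T, T))
    half = len(window) // 2
    for t in range(T):
        for k, w in enumerate(window):
            j = t + k - half
            if 0 <= j < T:
                W[t, j] = w
    return W


def mlpg_1d(means, variances, windows, relax_edges=False, big_var=1e11):
    """Hypothetical 1-D MLPG: solve (W^T P W) c = W^T P mu for c.

    means, variances: arrays of shape (T, num_windows).
    windows[0] is assumed to be the static window.
    """
    T = means.shape[0]
    variances = np.array(variances, dtype=float)  # work on a copy
    if relax_edges:
        # Assumption: inflate only the dynamic-feature variances at the
        # first and last frames, so those terms are effectively ignored.
        variances[0, 1:] = big_var
        variances[-1, 1:] = big_var

    A = np.zeros((T, T))
    b = np.zeros(T)
    for k, window in enumerate(windows):
        W = window_matrix(window, T)
        P = 1.0 / variances[:, k]          # diagonal precision
        A += W.T @ (P[:, None] * W)
        b += W.T @ (P * means[:, k])
    return np.linalg.solve(A, b)


# Toy input: constant static mean of 5, delta mean of 0 with small variance.
T = 10
windows = [[1.0], [-0.5, 0.0, 0.5]]
means = np.column_stack([np.full(T, 5.0), np.zeros(T)])
variances = np.column_stack([np.full(T, 1.0), np.full(T, 0.01)])

naive = mlpg_1d(means, variances, windows)
fixed = mlpg_1d(means, variances, windows, relax_edges=True)
```

Printing `naive` and `fixed` should show the difference: without the relaxation, the delta terms at t=0 and t=T-1 see an implicit zero outside the utterance and pull the neighbouring frames well below the static mean, while with the relaxation the trajectory stays at the static mean for this toy input.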

If you don't mind, I'll open a PR to fix this issue.

Thank you for the detailed report! Sure, I'd appreciate it if you could open a PR to fix the issue.

I fixed MLPG by changing the precision of the frames at the beginning and end of the utterance. This code assumes that the first window is the static feature window.
I'm not sure whether it works with MGE training and other modules.

Fixed by #96.

I want to include the fixes in a release as soon as possible, so I'm going to make a release after #98 is merged.