ming024 / FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"


How do you train the MFA acoustic model?

SandroChen opened this issue · comments

commented

I followed your tips on training the MFA acoustic model, but I cannot get labels on AISHELL-3 as accurate as the ones you provided.
I see there is an 'sp' in the alignment result, and its position is surprisingly accurate. I compared it with the labels that the AISHELL-3 dataset itself provides, and found that yours are more accurate. For example:

Take the sentence "广州%女大学生%登山%失联%四天%警方%找到%疑似%女尸$" (roughly: "A female college student in Guangzhou went missing for four days while mountain climbing; police found a suspected female body"). After listening to the original wave file, I find there should be a pause between "登" and "山", which the dataset's own labels miss. But in the TextGrid files that you provide, there is an "sp" between the phones "eng1" and "sh". I wonder how you managed to produce such an accurate pause label? The MFA acoustic model that I trained on AISHELL-3 does not produce it.
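For reference, with recent MFA releases the acoustic model is usually trained with something like `mfa train <corpus_dir> <lexicon> <model.zip>` and the TextGrids exported with `mfa align`; the exact commands and flags depend on the MFA version, so treat those as assumptions rather than the recipe used for the released alignments. Below is a minimal sketch (not the author's method) for comparing where the "sp" intervals land in your own TextGrids versus the released ones. It assumes the `tgt` package that this repo's preprocessor already uses, a phone tier named "phones", and the file paths are hypothetical placeholders.

```python
# Sketch: list the short-pause ("sp") intervals in an MFA TextGrid so that a
# locally trained alignment can be compared against the released one.
# Assumptions: `tgt` is installed, the phone tier is named "phones",
# and the paths below are placeholders.
import tgt


def list_pauses(textgrid_path, tier_name="phones"):
    """Return (start, end, previous_phone, next_phone) for every 'sp' interval."""
    textgrid = tgt.io.read_textgrid(textgrid_path)
    tier = textgrid.get_tier_by_name(tier_name)
    intervals = tier.intervals
    pauses = []
    for i, interval in enumerate(intervals):
        if interval.text == "sp":  # add "sil"/"spn" here if your model uses them
            prev_phone = intervals[i - 1].text if i > 0 else None
            next_phone = intervals[i + 1].text if i + 1 < len(intervals) else None
            pauses.append((interval.start_time, interval.end_time, prev_phone, next_phone))
    return pauses


if __name__ == "__main__":
    # Compare the released alignment with a locally trained one for the same
    # utterance (file names are placeholders).
    for label, path in [("released", "released/SSB0005_0001.TextGrid"),
                        ("local", "local/SSB0005_0001.TextGrid")]:
        for start, end, prev_phone, next_phone in list_pauses(path):
            print(f"{label}: sp {start:.2f}-{end:.2f}s between {prev_phone} and {next_phone}")
```

Running this on the same utterance from both sets makes it easy to see whether your model simply drops the pause between "eng1" and "sh" or places it elsewhere.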

commented

I have the same question.