EricWilbanks / faseAlign

Command line tool for forced-alignment of Spanish speech data

Home Page: http://fasealign.readthedocs.io/

Comparison of alignment accuracy

ABC0408 opened this issue

Thank you for your work.
I'm curious how the accuracy of this aligner compares with that of the Montreal Forced Aligner (MFA).
I don’t know if you have done this comparison.

Thanks for the question. I haven't done any accuracy comparisons, but I suspect the MFA models are slightly more accurate, since the training corpus seems to be larger and they're using the more recent Kaldi framework, compared to the earlier HTK framework that faseAlign uses. I know that Kaldi can support DNN acoustic models (which could be an improvement over the HMM-based models of faseAlign), but I am not sure if that is what the MFA is running under the hood.
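
For anyone who wants to run such a comparison themselves, one common approach is to align the same recordings with both tools and measure how far each tool's phone boundaries fall from a hand-corrected reference. The sketch below is a minimal illustration in Python, not something taken from faseAlign or MFA: the function name `boundary_agreement`, the 20 ms tolerance, and the example boundary times are all assumptions made for illustration. It also assumes the boundary times have already been extracted from the TextGrids and that both alignments cover the same phone sequence.

```python
from statistics import mean


def boundary_agreement(reference, hypothesis, tolerance=0.020):
    """Compare two alignments of the same phone sequence.

    reference, hypothesis: lists of boundary times in seconds, one entry
    per phone boundary, in the same order.
    tolerance: threshold in seconds (20 ms here is an assumed default).

    Returns (mean absolute difference in seconds,
             fraction of boundaries within the tolerance).
    """
    if len(reference) != len(hypothesis):
        raise ValueError("alignments must contain the same number of boundaries")
    if not reference:
        raise ValueError("no boundaries to compare")
    diffs = [abs(r - h) for r, h in zip(reference, hypothesis)]
    within = sum(d <= tolerance for d in diffs) / len(diffs)
    return mean(diffs), within


# Hypothetical boundary times in seconds, made up for illustration only.
hand_corrected = [0.10, 0.25, 0.40, 0.62]
aligner_output = [0.11, 0.24, 0.45, 0.62]

mad, within = boundary_agreement(hand_corrected, aligner_output)
print(f"mean abs. difference: {mad * 1000:.1f} ms, within 20 ms: {within:.0%}")
```

Running the same function once per aligner against the same reference would give two pairs of numbers that can be compared directly. Because the two tools appear to use different phone sets, the comparison is probably best restricted to boundaries of phones that both inventories share.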

From my cursory reading, here are the features/pros that distinguish faseAlign and MFA from each other:

MFA -

faseAlign -

  • Fully developed Spanish pronunciation dictionary
  • Linguistically informed ortho-to-phon mapping specifically for Spanish
  • Supports output of syllable-level information, including syllable boundaries and automatic marking of tonic syllables
  • Accepts .txt inputs (in addition to TextGrids, which MFA also supports)
  • Allows multi-speaker .txt inputs without turn-boundary timestamps (see 2.1.2)
  • Supports variable stereo mapping

Finally, this is neither a pro nor a con, but on the basis of the phoneme inventory I suspect that the MFA Spanish acoustic model is based on a Castilian dialect (I'm guessing that one of its phone symbols corresponds to an interdental), while the language model and phoneme inventory of faseAlign are Latin American Spanish. Depending on your test case, this could be relevant (though for shared phones you're unlikely to notice a drop in performance when using a cross-dialect model).

I'm happy to answer more questions, but again, I'm coming from a place of not having used MFA personally (I love their work, though!)

Thank you for your reply. I benefit a lot from your work.