Add scope document

Question

fulldecent opened this issue 4 years ago · comments

Hello Greg! Thanks for sharing. I'm glad to see your interest here and further research into this project.

As for the tooling, I believe Formant Analyzer does what you are describing. It works best, of course, if there is good quality audio coming in (decent microphone, pointed in the correct direction, with decent gain, without excessive amplification). Word choice also helps. It is calibrated for monosyllabic utterances: "Hiiii" will work well, "Helloooooooo" will not. You can see some of this because there are four screens in the app, and they are basically the debugging steps I use to review how it works.

I originally made the app to help my wife with accent reduction (born in China, living in the USA). It would mostly find a home with speech pathologists (working with adults who could speak but now have less function, or children with difficulty learning).

For your own work, please do publish the vowel chart you have created. Include F1/F2, how you measured them, demographics of the speakers (age, gender, language, region), type of speaking (casual, singing), and anything else relevant. If you can also upload sample audio files, that would be ideal, because researchers use these and I can add them as test files for the project.

The algorithms we are using are based on research we found on MathWorld. They were all originally implemented in MATLAB and then ported to iOS, and the results are compared on sample files to validate that the implementation is correct. I believe Formant Analyzer is better than Praat for the specific case of unattended monosyllabic plotting of F1/F2. But if there are some things we can learn from Praat, I'm happy to check it out.
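To make the validation step concrete, here is a minimal sketch of how a port could be checked against reference outputs on sample files. It is not the project's actual test code: the function name `compare_formants`, the 5 Hz tolerance, the file name, and all numbers are assumptions chosen for illustration only.

```python
def compare_formants(reference, ported, tolerance_hz=5.0):
    """Return (file, formant, reference_hz, ported_hz) for every mismatch.

    `reference` and `ported` map a sample file name to a dict of
    formant label -> frequency in Hz. The tolerance is an assumed value.
    """
    mismatches = []
    for name, ref_values in reference.items():
        port_values = ported.get(name)
        if port_values is None:
            # The port produced no result at all for this sample file.
            mismatches.append((name, "missing", None, None))
            continue
        for label, ref_hz in ref_values.items():
            port_hz = port_values.get(label)
            if port_hz is None or abs(port_hz - ref_hz) > tolerance_hz:
                mismatches.append((name, label, ref_hz, port_hz))
    return mismatches


# Hypothetical values, not measured data.
reference = {"sample_a.wav": {"F1": 300.0, "F2": 2300.0}}
ported = {"sample_a.wav": {"F1": 302.0, "F2": 2290.0}}

mismatches = compare_formants(reference, ported)
print(mismatches)
```

With these made-up numbers, F1 is within tolerance and F2 is flagged, so an empty list would indicate that the port agrees with the reference on every sample file.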

Also, I know our vowel plots are an item to improve. I don't see good primary sources on this, and there seem to be many regionally relevant publications (for Chinese speakers, European speakers, etc.), so I'm not sure what to do there. Either way, the F1/F2 readout will be the same. Only the relative placement on the vowel chart will change.
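As a rough illustration of that last point, the sketch below keeps one formant readout fixed and looks it up against two different reference charts. Everything here is an assumption for illustration: the helper `nearest_vowel`, the use of plain Euclidean distance in Hz, and the chart values, which are placeholders and not taken from any publication or from the app.

```python
import math


def nearest_vowel(f1_hz, f2_hz, reference_chart):
    """Return the vowel whose reference point is closest to the readout.

    `reference_chart` maps a vowel label to an (F1, F2) pair in Hz.
    Euclidean distance in Hz is an assumed, simplistic metric.
    """
    return min(
        reference_chart,
        key=lambda vowel: math.hypot(
            f1_hz - reference_chart[vowel][0],
            f2_hz - reference_chart[vowel][1],
        ),
    )


# Placeholder charts with made-up values, standing in for two regional sources.
chart_a = {"i": (300.0, 2300.0), "a": (750.0, 1300.0)}
chart_b = {"i": (350.0, 2500.0), "e": (450.0, 2000.0)}

# The measured readout is identical in both cases.
readout = (400.0, 2100.0)

label_a = nearest_vowel(*readout, chart_a)
label_b = nearest_vowel(*readout, chart_b)
print(label_a, label_b)
```

The readout itself never changes; only the label it lands nearest to depends on which chart is used.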