viveksck / simplicity

Code and Data for Simple Models for Word Formation in English Slang

About IPA alphabet

loretoparisi opened this issue

Thanks a lot for this amazing and inspiring work. I'm currently working on a Tensor2Tensor-like LSTM encoder/decoder G2P (grapheme-to-phoneme) model that uses the CMU 2 IPA dictionary and alphabet. Your model uses the standard CMU/Arpabet phone set; have you considered using IPA instead? See https://github.com/loretoparisi/docker/tree/master/g2p-seq2seq
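For readers comparing the two notations, here is a minimal sketch of how a CMU/Arpabet pronunciation could be rewritten in IPA. It is not the code from either repository: the function name `arpabet_to_ipa` and the table are illustrative assumptions, the symbol choices follow one common convention (for example `ɹ` for `R`), and placing the stress mark directly before the vowel rather than before the whole syllable is a simplification.

```python
# Illustrative Arpabet -> IPA table; symbol choices follow one common
# convention and are an assumption, not the mapping used by either project.
ARPABET_TO_IPA = {
    # vowels
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ",
    "AY": "aɪ", "EH": "ɛ", "ER": "ɝ", "EY": "eɪ", "IH": "ɪ",
    "IY": "i", "OW": "oʊ", "OY": "ɔɪ", "UH": "ʊ", "UW": "u",
    # consonants
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "F": "f", "G": "ɡ",
    "HH": "h", "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n",
    "NG": "ŋ", "P": "p", "R": "ɹ", "S": "s", "SH": "ʃ", "T": "t",
    "TH": "θ", "V": "v", "W": "w", "Y": "j", "Z": "z", "ZH": "ʒ",
}

# CMU stress digits: 0 = unstressed, 1 = primary, 2 = secondary.
STRESS_MARKS = {"0": "", "1": "ˈ", "2": "ˌ"}


def arpabet_to_ipa(phones):
    """Convert a list of Arpabet phones (e.g. ["HH", "AH0"]) to an IPA string."""
    out = []
    for phone in phones:
        # Vowels carry a trailing stress digit; consonants do not.
        if phone[-1].isdigit():
            base, stress = phone[:-1], phone[-1]
        else:
            base, stress = phone, "0"
        # Unstressed AH is conventionally written as schwa.
        if base == "AH" and stress == "0":
            symbol = "ə"
        else:
            symbol = ARPABET_TO_IPA[base]
        # Simplification: the stress mark goes right before the vowel.
        out.append(STRESS_MARKS[stress] + symbol)
    return "".join(out)


if __name__ == "__main__":
    print(arpabet_to_ipa("HH AH0 L OW1".split()))
```

With the CMU entry `HH AH0 L OW1` ("hello"), the call returns `həlˈoʊ`. Unknown phones raise a `KeyError`, which is deliberate in this sketch so that gaps in the table are noticed rather than silently skipped.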

Thanks for letting me know! When we were writing the paper, we used a pre-trained model (trained on the standard CMU dataset), mainly because of time constraints. Indeed, we note in our paper that using IPA might improve the models (since stress is accounted for). We will look at the IPA-based models you pointed out as well. Thanks!

@viveksck you are welcome! I will let you know the results of the model training. We are now working with the Tensor2Tensor Seq2Seq architecture by CMU and the CMU 2 IPA dictionary.