lingz / cmudict-ipa

CMUDict encoded as IPA

CMUDict encoded in IPA

The dictionary is a tab-separated file, cmudict.ipa.
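A minimal sketch of reading the file, assuming each line holds a word, a tab, and its IPA transcription (the exact column layout is not stated here, and `load_pronunciations` is a hypothetical helper name). A word may appear on several lines, so the pronunciations are grouped into a list per word:

```python
from collections import defaultdict


def load_pronunciations(lines):
    """Group IPA pronunciations by word from tab-separated lines.

    Assumes two columns per line: word, then IPA (an assumption,
    not something the repository documents).
    """
    entries = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        word, _, ipa = line.partition("\t")
        entries[word].append(ipa)
    return dict(entries)


# Made-up sample lines standing in for the real file contents.
sample = ["hello\thəloʊ\n", "read\tɹɛd\n", "read\tɹid\n"]
entries = load_pronunciations(sample)
```

To read the real file, pass an open handle instead of the sample list, for example `open("cmudict.ipa", encoding="utf-8")`.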

Notes:

  • Parentheses deleted for words with multiple pronunciations
  • Emphasis (stress) markers deleted
  • Split into 10% dev, 10% test, 80% train data sets in datasets/
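The notes above can be sketched roughly as follows. This is not the repository's actual preprocessing code. It assumes the parentheses are CMUDict's variant suffixes such as `READ(1)`, that "emphasis" refers to the stress digits 0, 1 and 2 on ARPAbet vowels, and that the 80% portion is the training set. All function names and the seed are hypothetical.

```python
import random
import re

# Matches a trailing variant marker such as "(1)" in "READ(1)".
VARIANT_SUFFIX = re.compile(r"\(\d+\)$")


def strip_variant(word):
    """Drop the parenthesised variant number from a CMUDict headword."""
    return VARIANT_SUFFIX.sub("", word)


def strip_stress(phonemes):
    """Drop trailing stress digits (0, 1, 2) from ARPAbet phonemes."""
    return [p.rstrip("012") for p in phonemes]


def split_dataset(items, seed=0):
    """Shuffle and split into 10% dev, 10% test, and the remaining ~80%."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_dev = len(items) // 10
    n_test = len(items) // 10
    dev = items[:n_dev]
    test = items[n_dev:n_dev + n_test]
    train = items[n_dev + n_test:]
    return dev, test, train
```

For example, `strip_variant("READ(1)")` gives `"READ"` and `strip_stress(["R", "IY1", "D"])` gives `["R", "IY", "D"]`.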

The mappings in arpa-ipa.map can be modified. They are taken from Wikipedia: https://en.wikipedia.org/wiki/Arpabet
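As an illustration of what such a mapping does, here is a small sketch that converts stress-free ARPAbet phonemes to an IPA string. The on-disk format of arpa-ipa.map is not described here, so the mapping is written as an in-memory dict holding only a handful of symbols from the Wikipedia ARPABET table. `ARPA_TO_IPA` and `arpabet_to_ipa` are hypothetical names.

```python
# A small subset of the ARPAbet-to-IPA table, for illustration only.
ARPA_TO_IPA = {
    "HH": "h",
    "AH": "ʌ",
    "L": "l",
    "OW": "oʊ",
    "K": "k",
    "AE": "æ",
    "T": "t",
}


def arpabet_to_ipa(phonemes, mapping=ARPA_TO_IPA):
    """Concatenate the IPA symbol for each ARPAbet phoneme.

    Raises KeyError for a phoneme that is missing from the mapping.
    """
    return "".join(mapping[p] for p in phonemes)


cat = arpabet_to_ipa(["K", "AE", "T"])
hello = arpabet_to_ipa(["HH", "AH", "L", "OW"])
```

Changing an entry in the dict (or, by analogy, in the map file) changes every transcription that uses that phoneme.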

License: MIT License


Languages

Language:Python 100.0%