David R. Mortensen's repositories
chatgpts-wugs
Code and data for the paper “Counting the Bugs in ChatGPT's Wugs: A Multilingual Investigation into the Morphological Capabilities of a Large Language Model”
morphotactics
Library for implementing morphotactic FSTs using Pynini and OpenFst
sch-corpus
A Hmong language corpus derived from the soc.culture.hmong Usenet group
syllabiphon
Library for sonority-based syllabification of arbitrary IPA strings
epitran.rs
A G2P library built in Rust as a successor to Epitran
elab-order
Data and code for the elaborate expression ordering project
epitran-online
Epitran Online
kairos-yaml
Conversion between DARPA KAIROS data formats and YAML
subwordmodeling
Website for the Subword Modeling course
syllabletk
Syllabification tools for natural language
emnlp2023-monolingual
Refined stylesheet for EMNLP 2023 (no multilingual support)
pronouncing-chinese-names
A pronunciation guide for Chinese names intended for speakers of American English
TextGridTools
Read, write, and manipulate Praat TextGrid files with Python
tufte-latex-minion-math
A Tufte-inspired LaTeX class for producing handouts, papers, and books