FranklinChen / pittsburgh-perl-workshop-2010-talk

Material for my Pittsburgh Perl Workshop 2010 talk

How do you pronounce 07-1191?

Material for my 20-minute talk at the Pittsburgh Perl Workshop 2010, given on Saturday, October 9, 2010.

Abstract

One part of the TalkBank project at Carnegie Mellon University is parsing text transcripts of Supreme Court oral arguments and converting them into the CHAT format (http://childes.psy.cmu.edu/manuals/chat.pdf), so that utterances can be linked to the audio of the oral arguments and then processed and analyzed further.

Because the CHAT format represents what is spoken, we had to convert a variety of written forms, such as "07-1191", "2nd", and "19.2-187", into pronounceable forms such as "oh seven eleven one ninety-one", "second", and "nineteen point two one eighty-seven".

This talk will outline the design of the converter and a natural implementation in Perl.
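
As a rough illustration of the kind of conversion described above, here is a minimal sketch in Perl. It is not the converter from the talk: the function names (`pronounce`, `pronounce_part`, `digit_run`, `two_digits`), the lookup tables, and the grouping rules are all assumptions made for this example. It only covers a few ordinals, decimals, and runs of one to three digits; longer digit runs are returned unchanged because the abstract does not spell out how they are grouped.

```perl
use strict;
use warnings;

# Hypothetical lookup tables for this sketch (not taken from the talk).
my @ONES = qw(zero one two three four five six seven eight nine ten
              eleven twelve thirteen fourteen fifteen sixteen seventeen
              eighteen nineteen);
my @TENS = ('', '', qw(twenty thirty forty fifty sixty seventy eighty ninety));
my %ORDINAL = ('1st' => 'first', '2nd' => 'second', '3rd' => 'third');

# Spell a value from 0 to 99.
sub two_digits {
    my ($n) = @_;
    return $ONES[$n] if $n < 20;
    my ($tens, $ones) = (int($n / 10), $n % 10);
    return $ones ? "$TENS[$tens]-$ONES[$ones]" : $TENS[$tens];
}

# Spell a run of one to three digits; longer runs are returned unchanged.
sub digit_run {
    my ($s) = @_;
    if (my ($digit) = $s =~ /^0(\d)$/) {             # leading zero: 07
        return "oh $ONES[$digit]";
    }
    return two_digits($s) if $s =~ /^\d{1,2}$/;      # 7, 19, 87
    if (my ($first, $rest) = $s =~ /^(\d)(\d\d)$/) { # 187: digit, then pair
        return "$ONES[$first] " . digit_run($rest);
    }
    return $s;
}

# Spell one hyphen-free part, reading decimal digits one at a time.
sub pronounce_part {
    my ($part) = @_;
    if (my ($whole, $frac) = $part =~ /^(\d+)\.(\d+)$/) {
        my $spoken_frac = join(' ', map { $ONES[$_] } split(//, $frac));
        return digit_run($whole) . " point $spoken_frac";
    }
    return digit_run($part);
}

# Spell a whole token: known ordinals first, then hyphen-separated parts
# spoken one after another.
sub pronounce {
    my ($token) = @_;
    return $ORDINAL{$token} if exists $ORDINAL{$token};
    return join(' ', map { pronounce_part($_) } split(/-/, $token));
}

print pronounce($_), "\n" for ('2nd', '19.2-187');
# second
# nineteen point two one eighty-seven
```

The sketch dispatches on the shape of each token with regular expressions, which is one plausible way to organize such a converter in Perl; a fuller version would need rules for longer digit runs and for more ordinals.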
