How to extract prosody features of wavs using festival for indic voices (Hindi)?
skmalviya opened this issue · comments
How can intonation (F0, etc.) and duration (syllable accent) feature files be generated for both training and test wavs using a trained "Hindi" Clustergen model?
Thanks in advance.
The utterance data structure has this information. Utterance structures are stored in the directory 'festival/utts' as part of the build process. The information you want can be dumped at multiple levels. For instance, to dump phone-level duration information for a sample file A.utt, here is the command:
$FESTDIR/examples/dumpfeats -feats dur.feats -relation "Segment" festival/utts/A.utt -output dest_dir/desired_fname.dur
Here's a breakdown:
$FESTDIR/examples/dumpfeats -> utility to dump features from utterance structure
-feats -> argument to specify a file that has the desired features
dur.feats -> File that has the desired features one per line. Example:
name
segment_start
segment_end
-relation -> The level at which to dump the features. In Festival, the 'Segment' relation corresponds to the phoneme level.
festival/utts/A.utt -> The utterance file to dump features from
-output dest_dir/desired_fname.dur -> destination location
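For reference, the pieces above can be put together roughly like this. It is only a sketch: the `mkdir` and the existence checks are my additions, so the snippet does nothing harmful if `$FESTDIR` is not set or A.utt is missing, and the file names are the same placeholders used in the command above.

```shell
# Write the feature list, one feature name per line.
printf '%s\n' name segment_start segment_end > dur.feats

# Make sure the destination directory exists (assumed not to be created automatically).
mkdir -p dest_dir

# Run dumpfeats only if the tool and the sample utterance are actually present.
if [ -x "$FESTDIR/examples/dumpfeats" ] && [ -f festival/utts/A.utt ]; then
    "$FESTDIR/examples/dumpfeats" -feats dur.feats -relation "Segment" \
        festival/utts/A.utt -output dest_dir/desired_fname.dur
fi
```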
To dump info about all files, pass every utterance file and put %s in the output name, so that one output file is written per utterance:
$FESTDIR/examples/dumpfeats -feats dur.feats -relation "Segment" festival/utts/*.utt -output dest_dir/%s.dur
Explanation here too:
http://festvox.org/bsv/x689.html
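If you want a single duration value per phone rather than start and end times, something like the following may help. It is a sketch that assumes each line of a dumped .dur file holds the three values from dur.feats, whitespace-separated and in that order (name, segment_start, segment_end); `phone_durations` is just a helper name made up for this example.

```shell
# Print "<phone> <duration in seconds>" for every line of a dumped .dur file.
# Assumed column order: name segment_start segment_end.
phone_durations() {
    awk '{ printf "%s %.3f\n", $1, $3 - $2 }' "$1"
}

# Apply it to every dumped file, skipping the loop if none exist yet.
for f in dest_dir/*.dur; do
    [ -e "$f" ] || continue
    echo "== $f"
    phone_durations "$f"
done
```

Check the output against one utterance you know well before relying on it, since the column order depends on what is in your dur.feats.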
Hope this isn't too late :)
Let me know if you run into any issues.