How to extract prosody features of wavs using festival for indic voices (Hindi)?

Question

skmalviya opened this issue 5 years ago

How can intonation (F0 etc.) and duration (syllable accent) feature files be generated for both training and testing wavs using a trained "Hindi" Clustergen model?

Thanks in advance.

Sai Krishna · Answer 1 · Sun Jun 23 2019 02:22:35 GMT+0800 (China Standard Time)

The utterance data structure has this information. Utterance structures are stored in the directory 'festival/utts' as part of the build process. The information you want can be dumped at multiple levels. For instance, to dump phone-level duration information for a sample file A.utt, here is the command:

$FESTDIR/examples/dumpfeats -feats dur.feats -relation "Segment" festival/utts/A.utt -output dest_dir/desired_fname.dur

Here's a breakdown:
$FESTDIR/examples/dumpfeats -> utility to dump features from utterance structure
-feats -> argument to specify a file that has the desired features
dur.feats -> File that has the desired features one per line. Example:

name
segment_start
segment_end

-relation -> The level at which to dump the features. In Festival, the 'Segment' relation corresponds to the phoneme level.
festival/utts/A.utt -> The utterance file to dump info from
-output dest_dir/desired_fname.dur -> destination location
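
Once the file is dumped, per-phone durations are just segment_end minus segment_start. Here is a minimal sketch of that step, not something taken from the Festival tools. It assumes each output line holds the three features from dur.feats in the same order, separated by whitespace; the function name phone_durations is made up for illustration.

```shell
# Sketch: print "<phone> <duration>" for each line of a dumped feature file.
# Assumption: every line is "name segment_start segment_end", whitespace
# separated, matching the feature order in dur.feats.
phone_durations() {
    # $1 = phone name, $2 = start time, $3 = end time
    awk 'NF >= 3 { printf "%s %.3f\n", $1, $3 - $2 }' "$1"
}

# Usage (path taken from the command above):
#   phone_durations dest_dir/desired_fname.dur
```

If your dumped lines look different, adjust the field numbers accordingly.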

To dump info about all files, pass every utterance file and put %s in the output name; dumpfeats replaces it with each utterance's file id:
$FESTDIR/examples/dumpfeats -feats dur.feats -relation "Segment" festival/utts/*.utt -output dest_dir/%s.dur
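
If you run this often, the batch dump can be wrapped in a small shell function. This is only a sketch under a few assumptions: it is run from the voice directory (so festival/utts/*.utt expands to all utterance files), FESTDIR is set, and the function name dump_all_feats is made up for illustration. The mkdir is there as a precaution in case the output directory does not exist yet.

```shell
# Sketch: dump the features listed in a feats file for every utterance.
# Assumptions: run from the voice directory, FESTDIR points at the Festival
# installation, and utterances live in festival/utts/.
dump_all_feats() {
    feats_file=$1   # e.g. dur.feats
    out_dir=$2      # e.g. dest_dir
    if [ -z "$FESTDIR" ] || [ ! -f "$feats_file" ]; then
        echo "FESTDIR must be set and $feats_file must exist" >&2
        return 1
    fi
    mkdir -p "$out_dir" || return 1
    # %s in the output name is filled in per utterance by dumpfeats
    "$FESTDIR/examples/dumpfeats" -feats "$feats_file" -relation Segment \
        -output "$out_dir/%s.dur" festival/utts/*.utt
}

# Usage:
#   dump_all_feats dur.feats dest_dir
```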

Explanation here too:
http://festvox.org/bsv/x689.html

Hope this isn't too late :)

Let me know if you run into any issues.