festvox / festival

Festival Speech Synthesis System

How to extract prosody features of wavs using festival for indic voices (Hindi)?

skmalviya opened this issue

How can intonation (F0 etc.) and duration (syllable accent) feature files be generated for both training and testing wavs using a trained Hindi Clustergen model?

Thanks in advance.

The utterance structure has this information. Utterance structures are stored in the directory 'festival/utts' as part of the build process. The information you want can be dumped at multiple levels. For instance, to dump phone-level duration information for a sample file A.utt, use this command:

$FESTDIR/examples/dumpfeats -feats dur.feats -relation "Segment" festival/utts/A.utt -output dest_dir/desired_fname.dur

Here's a breakdown:
$FESTDIR/examples/dumpfeats -> utility to dump features from utterance structure
-feats -> argument to specify a file that has the desired features
dur.feats -> File that has the desired features one per line. Example:

name
segment_start
segment_end

-relation -> The relation (level) at which to dump the features. In Festival, the 'Segment' relation is the phoneme level.
festival/utts/A.utt -> File that we desire to dump info about
-output dest_dir/desired_fname.dur -> destination location
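
As a rough sketch of the steps above, the feature list can be written with a here-document and then passed to dumpfeats. This assumes FESTDIR points at your Festival install and that festival/utts/A.utt exists; the output name dest_dir/A.dur is only an example, and the dumpfeats call is skipped if the script is not found.

```shell
# Write the feature list: one feature name per line.
cat > dur.feats <<'EOF'
name
segment_start
segment_end
EOF

# Make sure the destination directory exists (example name).
mkdir -p dest_dir

# Run dumpfeats only if it is available under FESTDIR (assumed install path).
if [ -x "${FESTDIR:-}/examples/dumpfeats" ]; then
    "$FESTDIR/examples/dumpfeats" -feats dur.feats -relation Segment \
        -output dest_dir/A.dur festival/utts/A.utt
else
    echo "dumpfeats not found; set FESTDIR to your Festival directory" >&2
fi
```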

To dump info about all files, pass every .utt file and use %s in the output name, which is replaced by each utterance's file name:
$FESTDIR/examples/dumpfeats -feats dur.feats -relation "Segment" -output dest_dir/%s.dur festival/utts/*.utt
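
Once the files are dumped, per-phone durations can be derived from the start and end times. The sketch below assumes each output line holds the three features from dur.feats separated by whitespace; sample.dur and its values are made up for illustration and are not real Festival output.

```shell
# Hypothetical sample in the layout dumpfeats is assumed to produce:
# name segment_start segment_end, separated by whitespace.
cat > sample.dur <<'EOF'
pau 0.000 0.215
n 0.215 0.290
a 0.290 0.410
EOF

# Print each phone with its duration (end minus start), in seconds.
LC_ALL=C awk '{ printf "%s %.3f\n", $1, $3 - $2 }' sample.dur > sample.durations
cat sample.durations
```

To run the same awk command over real output, replace sample.dur with a file from dest_dir.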

Explanation here too:
http://festvox.org/bsv/x689.html

Hope this isn't too late :)

Let me know if you run into any issues.