syang1993 / gst-tacotron

A TensorFlow implementation of "Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis"

preprocessing the training data

marymirzaei opened this issue

Thank you very much for your nice work.
I have a problem preprocessing the training data. The transcript file for the Blizzard2013 segmented data is named prompts.gui and can be found here:
https://www.dropbox.com/s/6ugwnbqgwlfvxvl/prompts.gui?dl=0
I was wondering what the metadata.train file should look like. It seems that I need to clean up the linked file before it can be used for training. Could you upload your cleaned-up 'metadata-train' file, the converter from prompts.gui to metadata-train, or a description of the desired format of the metadata.train file?

Hi, I simply extracted the text from prompts.gui, ignoring the other information such as prosody.

You can get the file format from the attachment.
metadata.zip

Do you know what the other information is? I can't understand what the 3rd line of each entry in prompts.gui means. Here is an example:

CA-BB-01-01
Black Beauty @ : # the Autobiography @ of a Horse . #
B L 62iHfN KcF _ B y13iHfW ^ T Y2iLfN @ : || _ DH Y2iLfN cYa _ 33iHfN ^ T N42iLfN ^ B 6y2iLfN cY ^ 42iHfN ^ GcS R N41iLfN ^ F Y1iLfN cYa @ _ N41iLfN VcD _ N41iLfN _ H 32iHfW R ScT . ||
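
For anyone trying to reproduce the text extraction described above, here is a minimal sketch in Python. It is not the repository author's converter. It assumes that prompts.gui consists of consecutive three-line records like the example (utterance ID, marked-up text, label line), and that the standalone "@" and "#" tokens in the text line are markers that can be dropped. The function names parse_prompts and clean_text are hypothetical, and the output format to write should be taken from the metadata.zip attachment rather than from this sketch.

```python
import re


def clean_text(marked):
    """Remove standalone '@' and '#' tokens (assumed to be markers, not text)."""
    tokens = [t for t in marked.split() if t not in {"@", "#"}]
    text = " ".join(tokens)
    # Re-attach punctuation that the transcript separates with a space.
    return re.sub(r"\s+([.,:;!?])", r"\1", text)


def parse_prompts(lines):
    """Yield (utterance_id, text) pairs.

    Assumption: after dropping blank lines, every record is exactly three
    lines long: ID, marked-up text, label line. The label line is ignored.
    """
    rows = [ln.strip() for ln in lines if ln.strip()]
    for i in range(0, len(rows) - 2, 3):
        yield rows[i], clean_text(rows[i + 1])


if __name__ == "__main__":
    # The record quoted in this thread (label line shortened for brevity).
    sample = [
        "CA-BB-01-01",
        "Black Beauty @ : # the Autobiography @ of a Horse . #",
        "B L 62iHfN KcF _ B y13iHfW ^ T Y2iLfN @ : ||",
    ]
    for utt_id, text in parse_prompts(sample):
        print(utt_id, text)
```

To process the real file, the same generator can be fed an open file handle, for example `parse_prompts(open("prompts.gui", encoding="utf-8"))`. If the actual file contains records with a different number of lines, the fixed step of three will misalign, so it is worth checking a few IDs in the output first.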