StoryTTS

STORYTTS: A HIGHLY EXPRESSIVE TEXT-TO-SPEECH DATASET WITH RICH TEXTUAL EXPRESSIVENESS ANNOTATIONS

StoryTTS is a highly expressive text-to-speech dataset that is rich in expressiveness from both the acoustic and the textual perspective. It is built from recordings of a Mandarin storytelling show (评书) delivered by a female artist, Lian Liru (连丽如). It contains 61 hours of consecutive and highly prosodic speech, equipped with accurate text transcriptions and rich textual expressiveness annotations.

Demos

Dataset Statistics

Download

Please download the speech data from Google Drive (5.06 GB)

Note
- The dataset is ONLY for research purposes.
- The ownership of the speech data remains with the original owner. By downloading this dataset, you agree to our licensing agreement. Note that these materials may be removed at any time upon request from the original owner.

File Description

dataset/transcript: The transcripts of StoryTTS in simplified Chinese with punctuation.
dataset/utt2dur: The duration (in seconds) of each utterance.
dataset/utt2spk: The speaker name of each utterance, i.e. the name of the only speaker in StoryTTS.

dataset/label: The annotation labels of StoryTTS. The format of this file is as follows:

utt-ID 句式(Sentence Pattern)|修辞手法(Rhetoric Device)|场景(Scene)|情感色彩(Emotional colors)|模仿人物(Imitated Characters)

dataset/prompt_claude2: Prompt and instruction for Claude2.
dataset/prompt_gpt4: Prompt and instruction for GPT4.
dataset/wav.scp: Path of wav files. Note: might be changed according to your location of storing the speech data.
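
As a minimal sketch of how these files might be read, the Python below parses one line of dataset/label into its five annotation fields and sums durations from dataset/utt2dur. It assumes that utt2dur, utt2spk and wav.scp follow the common Kaldi-style "utt-ID value" layout with one entry per line, which this README does not state explicitly. The function names and the sample line are hypothetical and are not taken from the dataset.

```python
from typing import Iterable, NamedTuple, Tuple


class Label(NamedTuple):
    """The five pipe-separated annotation fields of one dataset/label line."""
    sentence_pattern: str    # 句式
    rhetoric_device: str     # 修辞手法
    scene: str               # 场景
    emotional_color: str     # 情感色彩
    imitated_character: str  # 模仿人物


def parse_kv_line(line: str) -> Tuple[str, str]:
    """Split an assumed 'utt-ID value' line at the first run of whitespace."""
    utt_id, value = line.strip().split(maxsplit=1)
    return utt_id, value.strip()


def parse_label_line(line: str) -> Tuple[str, Label]:
    """Parse 'utt-ID f1|f2|f3|f4|f5' into the utterance ID and a Label."""
    utt_id, fields = parse_kv_line(line)
    parts = [part.strip() for part in fields.split("|")]
    if len(parts) != 5:
        raise ValueError(f"expected 5 label fields, got {len(parts)}: {line!r}")
    return utt_id, Label(*parts)


def total_hours(utt2dur_lines: Iterable[str]) -> float:
    """Sum per-utterance durations (in seconds) and convert to hours."""
    seconds = sum(
        float(parse_kv_line(line)[1])
        for line in utt2dur_lines
        if line.strip()  # skip blank lines
    )
    return seconds / 3600.0


if __name__ == "__main__":
    # Hypothetical sample line; real IDs and label values may differ.
    utt_id, label = parse_label_line("utt_0001 陈述句|比喻|对话|愤怒|无")
    print(utt_id, label.scene)
```

With a local copy of the data, `total_hours(open("dataset/utt2dur", encoding="utf-8"))` would give a quick sanity check of the total duration, provided the assumed layout holds.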

Citation

@inproceedings{storytts,
  author={Sen Liu and Yiwei Guo and Xie Chen and Kai Yu},
  title={{StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations}},
  year={2024},
  booktitle={ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={11521-11525},
  doi={10.1109/ICASSP48485.2024.10446023}
}

About

[ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations

https://goarsenal.github.io/StoryTTS/
