Missing transcripts?

Question

chenht2021 opened this issue 6 months ago

I read the FAQ on the page, but some transcripts still seem to be missing. For example, the speaker 3D_SPK_00001 does not appear in transcription/train_transcription or transcription/test_transcription.
Did I miss something?
Or are transcripts only provided for part of the data?
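
One way to confirm that a speaker is absent is to search the transcription files for the speaker ID. The snippet below is only a sketch: it assumes the transcription files are plain text and that each entry contains the speaker ID as a substring, which this thread does not confirm. The helper name `files_mentioning` is hypothetical.

```python
from pathlib import Path


def files_mentioning(speaker_id: str, root: str) -> list[str]:
    """Return sorted paths under `root` whose text contains `speaker_id`.

    Assumption: the files are plain text and the speaker ID appears
    verbatim in each entry. `root` may be a single file or a directory.
    """
    root_path = Path(root)
    if not root_path.exists():
        return []
    if root_path.is_file():
        candidates = [root_path]
    else:
        candidates = sorted(p for p in root_path.rglob("*") if p.is_file())

    hits = []
    for path in candidates:
        # Ignore undecodable bytes so one odd file does not stop the scan.
        text = path.read_text(encoding="utf-8", errors="ignore")
        if speaker_id in text:
            hits.append(str(path))
    return hits


if __name__ == "__main__":
    # Paths taken from the question; adjust them to the local layout.
    for location in ("transcription/train_transcription",
                     "transcription/test_transcription"):
        print(location, files_mentioning("3D_SPK_00001", location))
```

An empty list for both locations would be consistent with the speaker having no transcripts.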

Luyao Cheng · Answer 1 · Thu Jan 11 2024 17:21:37 GMT+0800 (China Standard Time)

Currently, our text annotations are only available for audio clips recorded with DIRECTIONAL devices. We focus on annotating clear and distinct audio, rather than less clear audio such as far-field recordings or speech in dialects. Our dataset is mainly intended for speaker-related tasks. If we release further text annotations, we will update the information on our website.

Haitao · Answer 2 · Thu Jan 11 2024 17:53:02 GMT+0800 (China Standard Time)

Thanks for your explanation.
OK, this may be off topic; if it is not appropriate, please close it.
I read the LauraGPT paper. It says the training data for TTS is LibriTTS and 3D-Speaker, copied 2 times, so the number of samples is 5.0M.
The LibriTTS train set is about 206K, and the whole 3D-Speaker train set is about 643K; if only annotated samples are counted, it will be less.
So is the number of samples for training TTS wrong? Should it be 500K?
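
As a quick sanity check, the figures quoted in this comment can be combined directly. This sketch assumes that "copied 2 times" means each sample appears twice in total, and it uses only the approximate counts given above, not numbers verified against the datasets.

```python
# Approximate counts quoted in the comment above (not verified).
libritts_train = 206_000
three_d_speaker_train = 643_000  # whole train set, including unannotated clips
copies = 2                       # assumption: each sample appears twice in total
reported = 5_000_000             # the 5.0M figure quoted from the paper

# Upper bound on TTS samples from these two sets under the assumption above.
total = (libritts_train + three_d_speaker_train) * copies
print(total)  # 1698000

# How many times larger the reported figure is than this upper bound.
print(round(reported / total, 2))
```

Under these assumptions the two datasets account for at most about 1.7M samples, so the quoted counts alone do not reach 5.0M.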

Luyao Cheng · Answer 3 · Fri Jan 12 2024 11:02:14 GMT+0800 (China Standard Time)

In the LauraGPT experiments, data from the highest-quality recording device in the 3D-Speaker dataset was used, and some data augmentation was applied. For specific data details, please refer to the original paper.

Luyao Cheng · Answer 4 · Fri Jan 12 2024 11:17:06 GMT+0800 (China Standard Time)

After double-checking with the authors, the LibriTTS figure you quoted appears to be smaller than expected. In addition, we also used data from aishell-1,2,3 in the TTS tasks, which was inadvertently omitted from the current preprint version of our paper. We will correct this in a subsequent revision.