Audio format in dataset files

Question

r666ay opened this issue 3 months ago · comments

Thanks for your great work on implementing FACodec!
I noticed that the data file at https://github.com/Plachtaa/FAcodec/blob/master/data/val.txt contains labels such as speaker IDs and phonemes. How can I obtain these labels? Are they generated automatically during training?

Songting · Answer 1 · Thu Aug 01 2024 21:28:37 GMT+0800 (China Standard Time)

Those labels come from the VCTK dataset and were used by the legacy implementation. The current version in this repo does not require annotations. Labels generated automatically during training are not saved.

Ray · Answer 2 · Thu Aug 01 2024 21:32:28 GMT+0800 (China Standard Time)

Thanks for your reply. What models are used to generate these annotations? I want to export the auto-generated labels.