About the audio-text pairs of the AudioSet dataset

Question

blue-blue272 opened this issue 3 months ago

AudioSet only contains audio clips and event labels. How did you obtain the caption descriptions for the audio clips in the AudioSet dataset?

Yuan Gong · Answer 1 · Sun Jun 30 2024 03:11:39 GMT+0800 (China Standard Time)

Please check this: https://github.com/XinhaoMei/WavCaps. It is mentioned in the paper, but probably not in a very obvious place.

-Yuan