YuanGongND / ltu

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

About the audio-text pairs of the AudioSet dataset.

blue-blue272 opened this issue

AudioSet only contains audio and event labels. How did you obtain the caption descriptions for the audio clips in the AudioSet dataset?

Please check this: https://github.com/XinhaoMei/WavCaps. It is mentioned in the paper, but probably not in a very obvious place.

-Yuan