microsoft / Pengi

An Audio Language model for Audio Tasks

Home Page: https://arxiv.org/abs/2305.11834

Information about Evaluation

asif-hanif opened this issue

Hi,
Thanks for the great work.
I have a question about the evaluation on the US8K dataset. This dataset has 10 folds, and its website recommends 10-fold cross-validation to obtain average test results. Could you please confirm whether you used 10-fold cross-validation? I evaluated the Pengi model on each fold separately, and the average accuracy across these folds does not match the number reported in the paper (Accuracy = 0.7185 on US8K, Table 3). I get an average accuracy of around 0.55.
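For reference, the per-fold averaging described above can be sketched roughly as follows. This is only an illustration of the arithmetic, assuming predictions have already been produced for each fold separately; the function names and the `folds` dictionary layout are hypothetical and not taken from the Pengi repository.

```python
from statistics import mean


def fold_accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground-truth labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty fold")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_fold_accuracy(folds):
    """Return (per-fold accuracies, unweighted mean across folds).

    `folds` maps a fold id to a (y_true, y_pred) pair; this layout is
    an assumption made for the example.
    """
    per_fold = {fold_id: fold_accuracy(y_true, y_pred)
                for fold_id, (y_true, y_pred) in folds.items()}
    return per_fold, mean(per_fold.values())
```

Note that this takes an unweighted mean over folds; if the folds differ in size, it will differ slightly from the accuracy computed over all clips pooled together.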

Hi @asif-hanif, make sure you are resampling the dataset to 44.1 kHz. The model performance drops when a different sampling rate is used.
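A minimal sketch of resampling a clip to 44.1 kHz before evaluation, in case it helps. It assumes the audio is already loaded as a NumPy array with time on the last axis and that SciPy is installed; the helper names are hypothetical, and this is not necessarily how the Pengi code resamples internally.

```python
from math import gcd

TARGET_SR = 44100  # 44.1 kHz, the rate mentioned in the reply above


def resample_factors(orig_sr, target_sr=TARGET_SR):
    """Smallest integer (up, down) pair with orig_sr * up / down == target_sr."""
    g = gcd(orig_sr, target_sr)
    return target_sr // g, orig_sr // g


def resample_to_target(waveform, orig_sr, target_sr=TARGET_SR):
    """Resample `waveform` (time on the last axis) from orig_sr to target_sr."""
    if orig_sr == target_sr:
        return waveform
    # Imported lazily so the ratio helper works without SciPy installed.
    from scipy.signal import resample_poly
    up, down = resample_factors(orig_sr, target_sr)
    # Polyphase resampling with SciPy's default anti-aliasing filter.
    return resample_poly(waveform, up, down, axis=-1)
```

For example, a clip recorded at 48 kHz would be resampled with `up=147, down=160`. It is worth checking each file's sampling rate rather than assuming one value for the whole dataset.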