Scrambled labels

Question

alma-lindborg opened this issue a year ago

Hi,

Thanks for building this benchmark, very cool!
I'm trying to run inference using the datasets and the pre-trained models, but I consistently get scrambled labels for all datasets except UT-HAR.
I think this stems from the following lines of code in the CSI_Dataset and Widar_Dataset classes:

self.data_list = glob.glob(root_dir+'/*/*.csv')
self.folder = glob.glob(root_dir+'/*/')

As the Python documentation notes, glob.glob returns results in arbitrary order. The label indices are taken from positions in this arbitrarily ordered list, so the class-to-index mapping can differ from one machine to another.

I've tried sorting the lists, but this doesn't unscramble the labels. Could you document the class-label-to-index mapping used for the pre-trained models, so that the benchmark can be used for inference?

For future experiments, I'd recommend always applying sort() to self.folder and self.data_list to prevent this problem.
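
Here is a minimal sketch of what I mean. The helper name build_label_index is hypothetical and not part of the repository, and I'm assuming one sub-folder per class under root_dir, as the glob patterns above suggest. It only makes the mapping reproducible for future training runs; it presumably cannot recover the order that was in effect when the existing pre-trained weights were produced.

```python
import glob
import os


def build_label_index(root_dir):
    """Hypothetical helper: list CSV files and class folders in a fixed order."""
    # sorted() removes the dependence on the filesystem's listing order
    folders = sorted(glob.glob(os.path.join(root_dir, '*', '')))
    data_list = sorted(glob.glob(os.path.join(root_dir, '*', '*.csv')))

    # Use the folder name as the class name and its sorted position as the index
    class_names = [os.path.basename(os.path.normpath(f)) for f in folders]
    class_to_idx = {name: idx for idx, name in enumerate(class_names)}
    return data_list, class_to_idx
```

Saving or printing class_to_idx alongside a trained checkpoint would also give users the mapping they need at inference time.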

Best,
Alma

Jianfei Yang commented 3 months ago

@xyanchen