Data deduplication
pawopawo opened this issue · comments
pawopawo commented
Have the ImageNet1K validation data and the ImageNet21K training data been de-duplicated?
Tal commented
The validation set of ImageNet1K is not in the ImageNet21K train set; I double-checked that.
So this is a "fair" scenario.
pawopawo commented
How can I check if there are duplicates?
Tal commented
I am sure the internet contains guides on that.
I ran a for loop over the validation set files.
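For anyone who wants to run a similar check, here is a minimal sketch of one possible approach. The thread does not say how the files were compared, so this is an assumption: it treats two images as duplicates only if their raw bytes are identical (same SHA-256 digest). The function names and the directory arguments are hypothetical, not part of this repository.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def find_exact_duplicates(val_dir, train_dir):
    """Return (val_file, train_file) pairs whose bytes are identical.

    Assumption: "duplicate" means byte-for-byte equal. Resized or
    re-encoded copies of the same picture are NOT detected.
    """
    # Index the train set once: digest -> first train file with that digest.
    train_index = {}
    for p in Path(train_dir).rglob("*"):
        if p.is_file():
            train_index.setdefault(file_digest(p), p)

    # Loop over the validation set files and look each one up in the index.
    pairs = []
    for p in sorted(Path(val_dir).rglob("*")):
        if p.is_file():
            match = train_index.get(file_digest(p))
            if match is not None:
                pairs.append((p, match))
    return pairs
```

Calling `find_exact_duplicates("path/to/val", "path/to/train")` (placeholder paths) returns an empty list when no validation file has a byte-identical copy in the train directory. An exact-hash check like this can miss images that were resized or re-compressed; catching those would need a different technique, such as perceptual hashing.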