Alibaba-MIIL / ImageNet21K

Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021)

Data deduplication

pawopawo opened this issue · comments

Have the ImageNet-1K validation data and the ImageNet-21K training data been de-duplicated?

commented

The validation set of ImageNet-1K is not in the ImageNet-21K training set; I double-checked that.
So this is a "fair" scenario.

How can I check if there are duplicates?

commented

I am sure the internet contains guides on that.
I ran a for loop over the validation set files.
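
For readers who want to repeat such a check, one possible shape for a loop over the validation files is sketched below. This is not the maintainer's script, which is not shown in the thread. It assumes that "duplicate" means a byte-identical file and that both datasets sit in local directories of `.JPEG` files. The function names `file_digest` and `find_exact_duplicates` and the `pattern` parameter are hypothetical. Re-encoded or resized copies would not be caught by this approach and would need a perceptual comparison instead.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_exact_duplicates(train_dir, val_dir, pattern="*.JPEG"):
    """Return (val_file, train_file) pairs whose contents are byte-identical.

    Assumption: images are stored as files matching `pattern` somewhere
    under each directory; adjust the pattern to the local layout.
    """
    # Index the training files by content hash (first file wins on ties).
    train_index = {}
    for path in Path(train_dir).rglob(pattern):
        train_index.setdefault(file_digest(path), path)

    # Loop over the validation files and look each one up in the index.
    duplicates = []
    for path in sorted(Path(val_dir).rglob(pattern)):
        match = train_index.get(file_digest(path))
        if match is not None:
            duplicates.append((path, match))
    return duplicates
```

Calling `find_exact_duplicates("path/to/imagenet21k/train", "path/to/imagenet1k/val")` with the local dataset paths returns an empty list when no validation file has an identical copy in the training set. Hashing the training set once keeps the check to a single pass over each dataset rather than a pairwise comparison.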