basveeling / pcam

The PatchCamelyon (PCam) deep learning classification benchmark.

Dataset file MD5 checksum problem

franciszchen opened this issue

Hello,
I downloaded PCam and computed the MD5 checksums of the files with md5sum on Ubuntu 16.04. The mapping between files and MD5 checksums seems to be mixed up.
E.g. on my computer, the MD5 checksum of camelyonpatch_level_2_split_valid_x.h5.gz is d5b63470df7cfa627aeec8b9dc0c066e. But on GitHub, its MD5 is listed as d8c2d60d490dbd479f8199bdfa0cf6ec, and d5b63470df7cfa627aeec8b9dc0c066e is listed as the MD5 of camelyonpatch_level_2_split_test_x.h5.gz. So the mapping from files to MD5 checksums on GitHub may be wrong. If possible, please check whether there is a mistake. Thank you!

$ md5sum *
3455fd69135b66734e1008f3af684566  test_meta.h5.gz
d8c2d60d490dbd479f8199bdfa0cf6ec  test_x.h5.gz
60a7035772fbdb7f34eb86d4420cf66a  test_y.h5.gz
5a3dd671e465cfd74b5b822125e65b0a  train_meta.h5.gz
1571f514728f59376b705fc836ff4b63  train_x.h5.gz
35c2d7259d906cfc8143347bb8e05be7  train_y.h5.gz
67589e00a4a37ec317f2d1932c7502ca  valid_meta.h5.gz
d5b63470df7cfa627aeec8b9dc0c066e  valid_x.h5.gz
2b85f58b927af9964a4c15b8f7e8f179  valid_y.h5.gz

Seconded. It looks like the checksums for the test and validation sets got mixed up. This is probably not an issue in practice, since those two sets are supposed to be interchangeable, if I understand correctly.
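
For anyone who wants to check this locally, here is a minimal Python sketch that hashes files and reports whether a mismatching checksum is actually listed under a different file name. The helper names `md5_of_file` and `compare_checksums` are made up for illustration and are not part of the PCam repository. The example values are the two checksums quoted in this issue, and it assumes that the short names in the md5sum listing above (e.g. test_x.h5.gz) are the renamed camelyonpatch_level_2_split_* files.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that large .h5.gz files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_checksums(computed, published):
    """Compare locally computed checksums with a published list.

    Both arguments map file name -> hex MD5. Each file is reported as
    'ok', 'listed under <other name>' (a likely swap), or
    'unknown checksum' (a likely corrupt or different download).
    """
    owner_of = {md5: name for name, md5 in published.items()}
    report = {}
    for name, md5 in computed.items():
        if published.get(name) == md5:
            report[name] = "ok"
        elif md5 in owner_of:
            report[name] = "listed under " + owner_of[md5]
        else:
            report[name] = "unknown checksum"
    return report


# Checksums quoted in this issue (only the two files discussed).
# Assumption: test_x.h5.gz in the listing is the renamed
# camelyonpatch_level_2_split_test_x.h5.gz.
computed = {
    "camelyonpatch_level_2_split_valid_x.h5.gz": "d5b63470df7cfa627aeec8b9dc0c066e",
    "camelyonpatch_level_2_split_test_x.h5.gz": "d8c2d60d490dbd479f8199bdfa0cf6ec",
}
published = {
    "camelyonpatch_level_2_split_valid_x.h5.gz": "d8c2d60d490dbd479f8199bdfa0cf6ec",
    "camelyonpatch_level_2_split_test_x.h5.gz": "d5b63470df7cfa627aeec8b9dc0c066e",
}

for name, status in compare_checksums(computed, published).items():
    print(name, "->", status)
```

In practice, `computed` would be filled by calling `md5_of_file` on each downloaded file. A "listed under" result points to a labelling mix-up in the published list rather than a corrupted download, which is what the values in this issue suggest.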