happynear / FaceVerification

An Experimental Implementation of Face Verification, 96.8% on LFW.

Washed CASIA-webface data set identities

shimen opened this issue

Hi, is there a list of identities for the Washed CASIA-webface data set? There are just numbers for each identity. I would like to use several databases for training and would like to remove identities that appear in more than one database.
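
If name lists were available for each database, the overlap check described above could be sketched roughly like this. It is only an illustration: the helper functions and the file names `casia_names.txt` and `other_names.txt` are hypothetical, and it assumes one identity name per line. Matching is a plain comparison of normalised strings, so alternative spellings of the same person would be missed.

```shell
# Hypothetical helper: normalise a name list (one name per line) so that
# "Aaron_Eckhart" and "aaron eckhart" compare equal, then sort it.
normalize_names() {
    tr '[:upper:]' '[:lower:]' < "$1" \
        | tr '_' ' ' \
        | sed 's/  */ /g; s/^ //; s/ $//' \
        | LC_ALL=C sort -u
}

# Hypothetical helper: print the names that occur in both lists.
overlapping_identities() {
    a=$(mktemp) && b=$(mktemp) || return 1
    normalize_names "$1" > "$a"
    normalize_names "$2" > "$b"
    # comm -12 keeps only the lines common to both sorted inputs.
    LC_ALL=C comm -12 "$a" "$b"
    rm -f "$a" "$b"
}

# Example (file names are assumptions):
#   overlapping_identities casia_names.txt other_names.txt > overlap.txt
```

The resulting list could then be used to drop the matching identity folders from one of the databases before training.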

Any update on the labels for the CASIA-webface data set?

@shimen how did you manage to unpack the dataset? I've got 5 files of 734 MB each, named .z01 to .z05, and one 340 MB .zip file that does not look like a zip file. If I concatenate all these files, I get a lot of errors like:

file #435315:  bad zipfile offset (lseek):  4403961856
file #435316:  bad zipfile offset (lseek):  4403970048
file #435316:  bad zipfile offset (lseek):  4403970048
file #435317:  bad zipfile offset (lseek):  4403986432

any idea how to correctly extract all the files?

thanks in advance!

regarding the identities, I spoke with the author of the original dataset, he said they cannot release identities at this time, but told me to check their web site later. not sure what that's supposed to mean =)

that's what I thought. could you please try:

$ unzip -t combined.zip

to see if there are any errors in the archive?
in my case, there are plenty, the archive seems damaged.
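
For a scripted version of that integrity check, a minimal sketch could rely on the exit status of `unzip -t` instead of reading through the output. The function name `check_archive` is made up for this example.

```shell
# Hypothetical helper: report whether an archive passes unzip's integrity test.
# `unzip -t` tests every entry; `-q` suppresses the per-file listing.
# Exit status 0 means no errors or warnings were detected. Any other status
# (including unzip not being installed) is reported as "problems".
check_archive() {
    if unzip -tq "$1" > /dev/null 2>&1; then
        echo "ok"
    else
        echo "problems"
    fi
}

# Example (combined.zip is the file name used in this thread):
#   check_archive combined.zip
```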

On 06/16/2016 03:46 AM, kihyuks wrote:

@lazydroid just in case, you can concatenate them and unzip in Ubuntu:

$ cat CASIA-maxpy-clean.z01 CASIA-maxpy-clean.z02 CASIA-maxpy-clean.z03 CASIA-maxpy-clean.z04 CASIA-maxpy-clean.z05 CASIA-maxpy-clean.zip > combined.zip
$ unzip combined.zip

@lazydroid Actually, the approach I suggested before only unzips about 1/5 of the files. You can try this instead:

$ zip -F CASIA-maxpy-clean.zip --out CASIA-maxpy-clean_fix.zip
$ unzip CASIA-maxpy-clean_fix.zip

This gives me around 450K images.

@shimen @lazydroid @kihyuks Could you please provide a link to the washed CASIA dataset?

@sidgan try this: http://www.down20.com/f-170364248744426

I did not make it, I have just googled the link.

@lazydroid @kihyuks @sidgan I downloaded the dataset, but I only get 439,532 images, so some images seem to be missing. I unpacked the dataset with these commands:

$ zip -F CASIA-maxpy-clean.zip --out CASIA-maxpy-clean_fix.zip
$ unzip CASIA-maxpy-clean_fix.zip

Is there any advice?

How many photos and classes should the washed CASIA-WebFace contain?
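
To compare your own extraction against the image counts quoted in this thread, a quick count of files and identity folders may help. This sketch assumes the extracted data has one sub-directory per identity directly under the dataset root and that the images are `.jpg` files. Both are assumptions about the layout, and the function names are made up for this example.

```shell
# Count image files anywhere under the dataset root (assumes .jpg images).
count_images() {
    find "$1" -type f -name '*.jpg' | wc -l | tr -d ' '
}

# Count identities, assuming one sub-directory per identity
# directly under the dataset root.
count_identities() {
    find "$1" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' '
}

# Example (the directory name is an assumption):
#   count_images CASIA-maxpy-clean
#   count_identities CASIA-maxpy-clean
```

If two people get different totals from the same download, the extraction method is the more likely cause than the data itself.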

You should use these commands to unzip multi-part zip files:

zip -s- CASIA-maxpy-clean.zip -O combined.zip
unzip combined.zip
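
Before merging a split archive, it can be worth confirming that every part actually finished downloading, since a missing or truncated part is one possible reason for a short image count. A minimal sketch, assuming the usual split-zip naming of `NAME.z01`, `NAME.z02`, ... plus a final `NAME.zip` (the function name `missing_parts` is made up for this example):

```shell
# Hypothetical helper: list split-archive parts that are missing.
# Usage: missing_parts BASENAME NUM_PARTS
# For BASENAME=CASIA-maxpy-clean and NUM_PARTS=5 it expects
# CASIA-maxpy-clean.z01 ... CASIA-maxpy-clean.z05 and CASIA-maxpy-clean.zip.
missing_parts() {
    base=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        part=$(printf '%s.z%02d' "$base" "$i")
        [ -f "$part" ] || echo "$part"
        i=$((i + 1))
    done
    [ -f "$base.zip" ] || echo "$base.zip"
    return 0
}

# Example: prints nothing when all six files are present.
#   missing_parts CASIA-maxpy-clean 5
```

This only checks that the files exist. It does not verify their sizes or contents.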