tamerthamoqa / facenet-pytorch-glint360k

A PyTorch implementation of the 'FaceNet' paper for training a facial recognition model with Triplet Loss using the glint360k dataset. A pre-trained model using Triplet Loss is available for download.


Questions about running validate_lfw() function in train_triplets_loss.py

riverHu233 opened this issue · comments

Hello @tamerthamoqa
I use the validate_lfw() function in my FaceNet project without changing anything, but during evaluation it took almost 2 hours to calculate the distances and other metrics, and I still didn't get a result. My first question is whether evaluation is supposed to take this long because it computes on the CPU instead of the GPU. If it does, evaluating every epoch would be costly, so I also wonder how long it takes to train the whole model. I would be very thankful if you could share the training details so I can figure out whether something is wrong with my code.
Thanks Sincerely!

Hello riverHu233,

On my system (Titan RTX), a training epoch of 5000 training iterations with 140x140 images and 544 triplets per batch takes around 2 hours 11 minutes in maximum performance mode, and the LFW evaluation takes around 10-15 seconds. You could multiply by around 30 as an approximation of the CPU computation time.

To be honest, I am not sure why it is taking so long on your end.

P.S: I also have an SSD.

Thanks for your reply. There must be something wrong with my code, because my hardware environment is almost the same as yours, so I should debug it. Thanks again!

Thanks @tamerthamoqa, you're right: the validation only takes 10 seconds in my code. The problem was a shape mismatch between distances (16200,) and labels (16200, 1); it didn't raise any error, and the program just never finished.
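
For anyone who hits the same symptom, here is a minimal sketch of why a (16200,) versus (16200, 1) mismatch can pass silently. It assumes the metric code combines the two arrays with NumPy elementwise operations; the variable names, the threshold, and the `check_same_shape` helper are illustrative and not taken from the repository. NumPy broadcasts the two shapes to an (n, n) matrix instead of raising, so with n = 16200 every comparison would work on roughly 262 million elements rather than 16200, which may explain the apparent hang.

```python
import numpy as np

n = 5  # small stand-in for the 16200 pairs mentioned above

distances = np.linspace(0.0, 2.0, n)          # shape (n,)
labels = np.array([[1], [0], [1], [0], [1]])  # shape (n, 1), an accidental column vector

threshold = 1.0                               # illustrative value only
predictions = distances < threshold           # shape (n,)

# Broadcasting (n,) against (n, 1) silently produces an (n, n) matrix.
mismatched = np.logical_and(predictions, labels)
print(mismatched.shape)

# Flattening the labels restores the intended elementwise comparison.
labels_flat = labels.reshape(-1)              # shape (n,)
matched = np.logical_and(predictions, labels_flat)
print(matched.shape)


def check_same_shape(a, b):
    """Hypothetical guard: fail loudly instead of broadcasting."""
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")


check_same_shape(predictions, labels_flat)    # passes without error
```

Calling a guard like `check_same_shape(distances, labels)` before computing the metrics would turn the silent slowdown into an immediate error.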