Questions about running validate_lfw() function in train_triplets_loss.py
riverHu233 opened this issue · comments
Hello @tamerthamoqa
I use the validate_lfw() function in my FaceNet project without changing anything, but evaluation ran for almost 2 hours computing the distances and other metrics, and I still didn't get a result. My first question: does evaluation normally take this long because it computes on the CPU instead of the GPU? If it does, evaluating every epoch would be costly, so I also wonder how long it takes to train the whole model. I would be very thankful if you could share your training details so I can figure out whether something is wrong with my code.
Thanks Sincerely!
Hello riverHu233,
For my system (Titan RTX), a training epoch of 5000 training iterations with 140x140 images and 544 triplets per batch takes around 2 hours 11 minutes in maximum performance mode; the LFW evaluation takes around 10-15 seconds. You could multiply by around 30 as an approximation of CPU computation time.
I am not sure why it is taking so long on your end, to be honest.
P.S: I also have an SSD.
Thanks for your reply. There must be something wrong with my code, because my hardware environment is almost the same as yours, so I should debug the code. Thanks again!
Thanks @tamerthamoqa, you're right: the validation only takes 10 seconds in my code. The problem was a shape mismatch between distances (16200,) and labels (16200, 1); it didn't raise any error, and the program just never finished.
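For anyone who hits the same symptom, here is a minimal NumPy sketch of why a (16200,) vs (16200, 1) mismatch raises no error. It is not the code from validate_lfw(); the variable names, the threshold of 1.0, and the small n are made up for illustration. NumPy broadcasting turns an elementwise operation between shapes (n,) and (n, 1) into an (n, n) result, so with n = 16200 every comparison would touch about 262 million elements instead of 16200, which is a plausible reason for the run appearing to hang.

```python
import numpy as np

n = 5  # small stand-in for the 16200 LFW pairs (illustrative only)

# Hypothetical inputs: 1-D distances and labels with a stray trailing axis.
distances = np.linspace(0.0, 2.0, n)               # shape (n,)
labels = (np.arange(n) % 2 == 0).reshape(-1, 1)    # shape (n, 1)

# Thresholding the distances keeps the 1-D shape.
predicted_same = distances < 1.0                   # shape (n,)

# Broadcasting (n,) against (n, 1) silently yields an (n, n) array.
mismatched = np.logical_and(predicted_same, labels)
print(mismatched.shape)

# Fix: flatten the labels so both arrays are 1-D, and check the shapes early.
labels_flat = labels.reshape(-1)                   # shape (n,)
assert distances.shape == labels_flat.shape, "distances and labels must match"

matched = np.logical_and(predicted_same, labels_flat)
print(matched.shape)
```

A shape assertion like the one above, placed before the metrics are computed, would turn the silent slowdown into an immediate error.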