ternaus / retinaface

A remake of https://github.com/biubug6/Pytorch_Retinaface

Fixed bug in the training code!

corkillj opened this issue · comments

commented

Possible bug in the provided training code?

I have heavily modified the training loop and moved the project to a more recent PyTorch Lightning version. While training, I noticed that the model did not learn the landmarks correctly (they would converge to the center of the bbox). After many hours of debugging, I believe I found the issue in the landmark encoding function in box_utils.py:

There seem to be missing parentheses in the encode_landm function: where the code is supposed to divide by the variance times the prior boxes, it divides by the variance alone and then multiplies by the prior boxes.
The fix puts the encoding in line with the corresponding landmark decoding function, which multiplies by both the priors and the variance. I am just curious how this made it into the repo, because I assume the pretrained weights were obtained with this code?

Original:

g_cxcy = matched[:, :, :2] - priors[:, :, :2]
# encode variance
g_cxcy = g_cxcy // variances[0] * priors[:, :, 2:]
# return target for smooth_l1_loss
return g_cxcy.reshape(g_cxcy.size(0), -1)

Modified:

g_cxcy = matched[:, :, :2] - priors[:, :, :2]
# encode variance
g_cxcy = g_cxcy / (variances[0] * priors[:, :, 2:])
# return target for smooth_l1_loss
return g_cxcy.reshape(g_cxcy.size(0), -1)
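
To make the difference between the two lines concrete, here is a minimal scalar sketch in plain Python (not the tensor code from box_utils.py). All numbers are made up for illustration: `offset` stands for one coordinate of `matched - priors`, `variance` for `variances[0]`, and `prior_size` for one entry of `priors[:, :, 2:]`. Note that `//` is floor division in Python, and that `//` and `*` have the same precedence and are evaluated left to right.

```python
import math

# Hypothetical values, chosen only for illustration.
offset = 0.03      # one coordinate of (matched - priors)
variance = 0.1     # stands in for variances[0]
prior_size = 0.2   # one entry of priors[:, :, 2:]

# Line as quoted under "Original": parsed as (offset // variance) * prior_size.
original = offset // variance * prior_size

# Line as quoted under "Modified": divide by the product of variance and prior size.
modified = offset / (variance * prior_size)

print("original:", original)
print("modified:", modified)
```

With these assumed inputs the two expressions give clearly different targets, so the loss would be computed against different values depending on which line is used.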

Relevant decode_landm part:

return torch.cat(
        (
            priors[:, :2] + pre[:, :2] * variances[0] * priors[:, 2:],
            priors[:, :2] + pre[:, 2:4] * variances[0] * priors[:, 2:],
            priors[:, :2] + pre[:, 4:6] * variances[0] * priors[:, 2:],
            priors[:, :2] + pre[:, 6:8] * variances[0] * priors[:, 2:],
            priors[:, :2] + pre[:, 8:10] * variances[0] * priors[:, 2:],
        ),
        dim=1,
)
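
One way to sanity-check that the modified encoding really is the inverse of this decoding is a round trip on a single landmark point. The sketch below is a scalar analogue in plain Python, not the repo's tensor code; the helper names `encode_point` and `decode_point`, the variance value, and the sample coordinates are all assumptions made for illustration.

```python
import math

VARIANCE = 0.1  # stands in for variances[0]; value assumed for illustration


def encode_point(landmark, prior_center, prior_size, variance=VARIANCE):
    """Scalar analogue of the modified encode_landm line for one (x, y) point."""
    return tuple(
        (p - c) / (variance * s)
        for p, c, s in zip(landmark, prior_center, prior_size)
    )


def decode_point(encoded, prior_center, prior_size, variance=VARIANCE):
    """Scalar analogue of one row of decode_landm: center + pre * variance * size."""
    return tuple(
        c + e * variance * s
        for e, c, s in zip(encoded, prior_center, prior_size)
    )


# Hypothetical prior box (center and width/height) and landmark position.
prior_center = (0.5, 0.5)
prior_size = (0.2, 0.4)
landmark = (0.53, 0.42)

encoded = encode_point(landmark, prior_center, prior_size)
decoded = decode_point(encoded, prior_center, prior_size)

# Decoding the encoded target should recover the landmark up to float error.
round_trip_ok = all(math.isclose(d, p, abs_tol=1e-12) for d, p in zip(decoded, landmark))
print("round trip ok:", round_trip_ok)
```

The same kind of check could be run against the real encode_landm and decode_landm with small random tensors, assuming both are importable from box_utils.py.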
commented

For info, there is another repository that is actively maintained by a researcher in the field: https://github.com/xinntao/facexlib
I also started with ternaus/retinaface (for inference only), but ended up preferring xinntao/facexlib.

I mention that because I see that you have already found 2 bugs:

commented

Thanks a lot for mentioning it! Bummer that I didn't find this earlier, because now I think I'm way too deep into my project to switch. But I do like that this repo is using Lightning!
Plus, I don't see a training script anywhere in the facexlib repo. Or did I just not look closely enough?

commented

Ah, I think you are right!

How this happened is a good question. I think I copied this part of the code from the original repo without changing it.

Fixed.