Fixed bug in the training code!

corkillj opened this issue 3 years ago · comments

Possible bug in the provided training code?

I have heavily modified the training "loop" and moved the project to a more recent PyTorch Lightning version. But when training, I noticed that the model didn't learn the landmarks correctly (they would converge to the center of the bbox). After many hours of debugging, I believe I found the issue in the encoding function for the landmarks in box_utils.py:

There seem to be missing parentheses in the encode_landm function. The code is supposed to divide by the variance times the prior boxes, but instead it divides by only the variance and then multiplies by the prior boxes.
The fix brings the encoding in line with the corresponding landmark decoding function, which multiplies by both the priors and the variance. I am just curious how this made it into the repo, because I'm assuming that the pretrained weights were obtained with this code?

Original:

g_cxcy = matched[:, :, :2] - priors[:, :, :2]
# encode variance
g_cxcy = g_cxcy // variances[0] * priors[:, :, 2:]
# return target for smooth_l1_loss
return g_cxcy.reshape(g_cxcy.size(0), -1)

Modified:

g_cxcy = matched[:, :, :2] - priors[:, :, :2]
# encode variance
g_cxcy = g_cxcy / (variances[0] * priors[:, :, 2:])
# return target for smooth_l1_loss
return g_cxcy.reshape(g_cxcy.size(0), -1)

Relevant decode_landm part:

return torch.cat(
        (
            priors[:, :2] + pre[:, :2] * variances[0] * priors[:, 2:],
            priors[:, :2] + pre[:, 2:4] * variances[0] * priors[:, 2:],
            priors[:, :2] + pre[:, 4:6] * variances[0] * priors[:, 2:],
            priors[:, :2] + pre[:, 6:8] * variances[0] * priors[:, 2:],
            priors[:, :2] + pre[:, 8:10] * variances[0] * priors[:, 2:],
        ),
        dim=1,
)
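
To make the mismatch concrete, here is a minimal scalar sketch of the arithmetic from the snippets above. It is not the repository's code: the function names and the example values are hypothetical, and plain Python floats stand in for tensors, so it only illustrates why the encoder must be the exact inverse of the decoder.

```python
# Scalar sketch of the landmark encode/decode arithmetic discussed above.
# Function names and values are hypothetical; only the formulas follow the post.


def encode_quoted_original(matched, prior_center, prior_size, variance):
    # Same operator order as the quoted "Original" line:
    # floor-divide by the variance, then multiply by the prior size.
    return (matched - prior_center) // variance * prior_size


def encode_fixed(matched, prior_center, prior_size, variance):
    # Same operator order as the "Modified" line:
    # divide by (variance * prior size).
    return (matched - prior_center) / (variance * prior_size)


def decode(pred, prior_center, prior_size, variance):
    # Same formula as each term in decode_landm:
    # prior center + prediction * variance * prior size.
    return prior_center + pred * variance * prior_size


# Assumed example values for one coordinate of one landmark.
prior_center, prior_size, variance = 0.5, 0.2, 0.1
landmark = 0.53

fixed_round_trip = decode(
    encode_fixed(landmark, prior_center, prior_size, variance),
    prior_center, prior_size, variance,
)
original_round_trip = decode(
    encode_quoted_original(landmark, prior_center, prior_size, variance),
    prior_center, prior_size, variance,
)

print("fixed:", fixed_round_trip)
print("quoted original:", original_round_trip)
```

With these example values, the fixed encoder round-trips back to the landmark (up to floating-point error), while the quoted original encodes the offset as zero and therefore decodes to the prior center.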

Wok · Answer 1 · Fri Nov 19 2021 19:25:02 GMT+0800 (China Standard Time)

For info, there is another repository which is actively maintained by a researcher in the field: https://github.com/xinntao/facexlib
I also started with ternaus/retinaface (for inference only), but ended up preferring xinntao/facexlib.

I mention that because I see that you have already found 2 bugs:

#37
#38

Jason · Answer 2 · Fri Nov 19 2021 20:11:33 GMT+0800 (China Standard Time)

Thanks a lot for mentioning it! Bummer that I didn't find this earlier, because now I think I'm way too deep into my project to switch. But I do like that this repo here is using Lightning!
Plus, I don't see a training script anywhere in the facexlib repo. Or did I just not look closely enough?

Wok · Answer 3 · Fri Nov 19 2021 20:24:14 GMT+0800 (China Standard Time)

Ah, I think you are right!

Vladimir Iglovikov · Answer 4 · Tue Jan 25 2022 03:51:08 GMT+0800 (China Standard Time)

It is a good question how this happened. I think I copied this part of the code from the original repo without changing it.

Fixed.