rpautrat / SuperPoint

Efficient neural feature detector and descriptor


About detector_loss

foxkw opened this issue · comments

commented

Hi, thanks for your great work.

I found that the warped_detector_loss is difficult to converge when training SuperPoint, because the warped image content changes with each epoch, while the loss on the original image converges completely. Is there any good way to solve this problem?

image

commented


The green points are the ground truth, and the red points are the predictions.

Hi, ideally both the source and target images should be warped during training, to improve the diversity of the training images. I think that currently, when training on pairs of images, only the target image is warped, but it should not be too difficult to warp the source one as well.
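For illustration only, here is a minimal OpenCV/NumPy sketch of the idea (not the repo's actual data pipeline; `random_homography`, `warp_pair` and the parameter ranges are my own, illustrative names and values): both images of a pair are warped with independently sampled homographies, and the homography used for the loss is the composition of the two.

```python
import cv2
import numpy as np

def random_homography(h, w, max_angle=np.pi / 6, max_scale=0.2, max_trans=0.1):
    """Sample a small rotation + scaling + translation homography (illustrative ranges)."""
    angle = np.random.uniform(-max_angle, max_angle)
    scale = 1.0 + np.random.uniform(-max_scale, max_scale)
    tx = np.random.uniform(-max_trans, max_trans) * w
    ty = np.random.uniform(-max_trans, max_trans) * h
    c, s = scale * np.cos(angle), scale * np.sin(angle)
    cx, cy = w / 2, h / 2
    # Rotate/scale around the image center, then translate.
    H = np.array([[c, -s, cx - c * cx + s * cy + tx],
                  [s,  c, cy - s * cx - c * cy + ty],
                  [0,  0, 1]], dtype=np.float64)
    return H

def warp_pair(image):
    """Warp the same image twice and return the homography relating the two warped views."""
    h, w = image.shape[:2]
    H_src = random_homography(h, w)
    H_tgt = random_homography(h, w)
    source = cv2.warpPerspective(image, H_src, (w, h))
    target = cv2.warpPerspective(image, H_tgt, (w, h))
    # Homography mapping source pixels to target pixels: x_tgt = H_tgt * inv(H_src) * x_src.
    H_src_to_tgt = H_tgt @ np.linalg.inv(H_src)
    return source, target, H_src_to_tgt
```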

commented


Hi, "I warpeded the source image and the corresponding target image, but when my homography only uses rotation, adding some data augmentation, the loss can converge very well. However, when my homography uses rotation, translation, and scaling, the warp loss cannot converge very well. There are pixel errors in gt and prediction. I tried to debug the homography parameters, but found it difficult to solve this problem. Have you ever encountered this situation, or is it normal ?

As shown in the figure below, most points correspond exactly, but some do not. The green points are the ground truth and the red ones are the predictions; when a ground-truth point matches a prediction, the green point covers the red one.

image

Hi, the prediction will never be perfect anyway, and the current one seems pretty good in my opinion. Since the green points can cover the red ones, it is hard to judge whether the predictions are detecting most corners. But they are at least very precise, in the sense that all predicted keypoints are located on corners.

commented


Sorry to bother you again. When I train the model with a large amount of data, the detector_loss on the original image is about 0.05, while the warped detector_loss is about 0.3. It seems the warped image still cannot be learned well. Is this normal?

Hi, rotation only and full warp are two different tasks, and the latter is harder. So it makes sense that your warped detector_loss is higher. I don't think there is any issue there; the gap only reflects the fact that training with full warps is harder than training with rotation only.

You should also not rely only on the train/val loss to judge the quality of your model, but evaluate it on a proper test set. You could compare the repeatability / localization error of both models on two types of data: with rotation only, and with full warps.
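As a rough illustration of that evaluation, here is a simplified, one-directional sketch of repeatability and localization error for a single image pair, given the detected keypoints and the ground-truth homography (my own code and names, not the repo's evaluation scripts):

```python
import numpy as np

def repeatability(kpts_a, kpts_b, H_ab, img_shape, dist_thresh=3):
    """kpts_*: (N, 2) arrays of (x, y) keypoints; H_ab: homography from image a to image b."""
    # Warp keypoints from image a into image b (homogeneous coordinates).
    pts = np.concatenate([kpts_a, np.ones((len(kpts_a), 1))], axis=1)
    warped = (H_ab @ pts.T).T
    warped = warped[:, :2] / warped[:, 2:3]

    # Keep only warped points that land inside image b.
    h, w = img_shape
    valid = (warped[:, 0] >= 0) & (warped[:, 0] < w) & (warped[:, 1] >= 0) & (warped[:, 1] < h)
    warped = warped[valid]
    if len(warped) == 0 or len(kpts_b) == 0:
        return 0.0, np.inf

    # Distance from every warped point to every detection in image b.
    dists = np.linalg.norm(warped[:, None, :] - kpts_b[None, :, :], axis=2)
    min_dists = dists.min(axis=1)
    matched = min_dists < dist_thresh
    rep = matched.mean()                                          # fraction of repeated points
    loc_error = min_dists[matched].mean() if matched.any() else np.inf
    return rep, loc_error
```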

Hi, the idea is the following:

Appending the ones allows us to insert a label for the "no keypoint" dustbin in every 8x8 patch, and multiplying by 2 gives more weight to the ground-truth labels. So after this operation, labels will contain 0s (for pixels where there was no keypoint in the ground truth), always a 1 for the dustbin, and 2s where there was a ground-truth keypoint. When we take the argmax in the next line, there are two cases: either there was a keypoint in the patch in the ground truth, in which case the argmax picks the corresponding cell with a 2 in labels, or there was no keypoint (so labels was full of zeros) and the dustbin cell is selected instead.
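A minimal NumPy sketch of that idea (my own illustration of the logic, not the exact code in this repo; the real implementation may also handle ties between multiple keypoints in one patch differently, whereas here the argmax simply picks the first one):

```python
import numpy as np

def patch_labels(keypoint_map):
    """keypoint_map: (H, W) binary ground-truth map, with H and W divisible by 8.

    Returns an (H/8, W/8) map of class indices in [0, 64]:
    0..63 = position of the keypoint inside the 8x8 patch, 64 = "no keypoint" dustbin.
    """
    H, W = keypoint_map.shape
    # Reshape into (H/8, W/8, 64): every 8x8 patch becomes one 64-dim cell.
    patches = (keypoint_map.reshape(H // 8, 8, W // 8, 8)
                           .transpose(0, 2, 1, 3)
                           .reshape(H // 8, W // 8, 64))
    # Give ground-truth keypoints weight 2 and append a constant 1 for the dustbin.
    dustbin = np.ones((H // 8, W // 8, 1))
    labels = np.concatenate([2 * patches, dustbin], axis=-1)  # (H/8, W/8, 65)
    # Argmax picks a keypoint cell (value 2) if the patch has one, else the dustbin (value 1).
    return np.argmax(labels, axis=-1)
```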

I hope this is clear.

Thank you so much for the swift response! I am investigating MagicPoint performance, as my SuperPoint performance was unsatisfactory. This is the output of MagicPoint trained on the Synthetic Dataset (120k images):
image

And this is from GFTT (Shi Tomasi)
image

As you can see, the keypoints from GFTT are more interpretable than those from MagicPoint. Clearly, the pattern on the giraffe is being detected as the polygons the model has already been trained on. Furthermore, the losses have dropped drastically, which makes me think the model has overfitted to the synthetic shapes. Do you think this is normal? What suggestions do you have, given your extensive experience working on this?

Hi, yes this is a generalization problem that can be expected for the model trained on synthetic data. What you could do is:

  1. Reduce the threshold to accept more points. Currently you are only displaying the points where the model is the most confident, but with a threshold of 0.001 for example, you might see more points (see the extraction sketch after this list).
  2. If you think the model is overfitting on the synthetic shapes, add more data augmentation in the synthetic training, and do early stopping, i.e. stop training as soon as the validation loss stops decreasing, or even earlier, once it has almost converged.
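Regarding point 1, here is one possible way to extract keypoints from the detector probability map at a lower threshold, with a simple greedy non-maximum suppression (a sketch with illustrative names and values, not the repo's extraction function):

```python
import numpy as np

def extract_keypoints(prob_map, threshold=0.001, nms_radius=4):
    """prob_map: (H, W) detector probability map. Returns an (N, 2) array of (x, y) points."""
    ys, xs = np.where(prob_map > threshold)
    scores = prob_map[ys, xs]
    # Greedy non-maximum suppression: keep the strongest point, suppress its neighbourhood.
    order = np.argsort(-scores)
    keep = []
    occupied = np.zeros_like(prob_map, dtype=bool)
    for i in order:
        x, y = xs[i], ys[i]
        if occupied[y, x]:
            continue
        keep.append((x, y))
        y0, y1 = max(0, y - nms_radius), y + nms_radius + 1
        x0, x1 = max(0, x - nms_radius), x + nms_radius + 1
        occupied[y0:y1, x0:x1] = True
    return np.array(keep).reshape(-1, 2)
```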