Weizhi-Zhong / IP_LAP

Implementation of the CVPR 2023 paper "Identity-Preserving Talking Face Generation With Landmark and Appearance Priors"

large content landmarks

KingStorm opened this issue

Nice work!

I am trying to train IP_LAP with custom data, but in the results the content landmarks are generally larger than the pose landmarks, so there is a mismatch. However, if I use the pretrained model, the size of the resulting content landmarks is correct.

Training dataset: a 5-minute 480 × 640 video.

[Screenshot 2023-06-02 21:23: generated result showing the landmark size mismatch]

Hi, thanks for your interest.
Does the eval_L1_loss decrease to 6e-3, as described in this issue?
If not, what is the final eval_L1_loss of your training with custom data?
Since your training dataset is a single 5-minute 480 × 640 video, I doubt whether it is enough.

Or is your training overfitting? Compare the running loss and the eval loss, for example as sketched below.
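
A quick way to compare them, assuming you log the loss values yourself (the function and variable names here are illustrative, not from the repository):

```python
# Minimal sketch of an overfitting check; assumes you collected the logged
# losses yourself. Nothing here comes from the IP_LAP codebase.
import matplotlib.pyplot as plt

def plot_losses(train_steps, train_loss, eval_steps, eval_loss):
    """Plot running (train) and eval L1 losses on a shared log axis."""
    plt.plot(train_steps, train_loss, label="running L1 loss (train)")
    plt.plot(eval_steps, eval_loss, label="eval L1 loss")
    plt.yscale("log")  # losses near 1e-3 are easier to compare on a log scale
    plt.xlabel("training step")
    plt.ylabel("L1 loss")
    plt.legend()
    plt.show()

# A training loss that keeps dropping while the eval loss plateaus or rises
# is the classic overfitting signature, which is plausible on a 5-minute dataset.
```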

Thanks for your reply. The eval_L1_loss does decrease to the 1e-3 level, so I would consider it fitted enough (if anything, overfitted), and I tested it on the training data.

I drew sketches of the landmarks during training, and they look reasonable:
[image: {epoch}_{step}_pred_sketch, a predicted sketch from training]

However, in inference the sketch turns out to be mismatched between the pose and content landmarks:
[image: inference sketch showing the pose/content landmark mismatch]

> I drew sketches of the landmarks during training

Hi, thanks for your interest.
Does that mean you draw sketches on the training dataset during training, and on the testing dataset during inference?

Hi, I seem to have found a clue about the size mismatch.

I found that the landmarks extracted by preprocess_video.py exactly fill the 128x128 image, with no space left, like:
[image: training sketch where the landmarks reach the image borders]

However, the landmarks extracted by inference_single.py leave some space in the 128x128 image, like:
[image: 0_0_pred_sketch_not_replace_Nl_5k, an inference sketch with margins around the landmarks]

Hi, thanks for your interest.
As shown in the following code:
https://github.com/Weizhi-Zhong/IP_LAP/blob/e5d8fdc1ab01a1426ac4c8cfec461ec5d024050d/preprocess/preprocess_video.py#LL251C19-L251C19
While preprocessing the LRS2 dataset, we add 5 extra pixels to the margin so that the normalized coordinate of the bottom-most landmark is not always 1.
Similarly, in inference:
https://github.com/Weizhi-Zhong/IP_LAP/blob/e5d8fdc1ab01a1426ac4c8cfec461ec5d024050d/inference_single.py#LL282C9-L282C9
we add some (25) pixels so that the landmarks stay within the cropping region.
Depending on your dataset and input videos, you can change the number of pixels added to the margin so that all landmarks fall within the cropping region.
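
To make the effect concrete, here is a minimal sketch of the idea; the function and the numbers are illustrative, not the repository's exact code:

```python
# Minimal sketch (not the repository's exact code) of how a bottom margin
# affects normalized landmark coordinates. All names and values are illustrative.
import numpy as np

def normalize_landmarks(landmarks, ymin, ymax, xmin, xmax, extra_margin=5):
    """Normalize pixel landmarks into [0, 1] relative to the face crop.

    Without extra_margin, the lowest landmark (e.g. the chin) sits exactly
    on the crop border, so its normalized y-coordinate is always 1.0.
    Adding a few pixels below the face leaves headroom, analogous to the
    5 px used in preprocessing and the 25 px used in inference.
    """
    ymax = ymax + extra_margin                # extend the crop downward
    h, w = ymax - ymin, xmax - xmin
    return (landmarks - np.array([xmin, ymin])) / np.array([w, h])

# Example: a chin landmark lying on the crop's bottom edge.
chin = np.array([[64.0, 200.0]])              # (x, y) in frame pixels
print(normalize_landmarks(chin, ymin=72, ymax=200, xmin=0, xmax=128, extra_margin=0))   # y == 1.0
print(normalize_landmarks(chin, ymin=72, ymax=200, xmin=0, xmax=128, extra_margin=25))  # y < 1.0
```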

Hope this can be helpful for you.

Thanks, fair enough.