microsoft / MaskFlownet

[CVPR 2020, Oral] MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask

Home Page: https://arxiv.org/abs/2003.10955

Finetuning on KITTI dataset with sparse ground truth

DeepDeep123 opened this issue · comments

Hi, authors,

Thanks for sharing the code, it is a great work!

When fine-tuning on the KITTI dataset, only sparse ground truth is available. In this case, if we apply geometric transformations such as scaling and rotation with bilinear sampling, problems arise: the zeros at the many unlabeled pixels get mixed into neighboring values, and the binary valid mask is no longer binary.
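To make the problem concrete, here is a toy illustration in plain NumPy (the helper below is my own, not from the repo): sampling a sparse flow map between pixels blends in the zeros of unlabeled neighbors, and the interpolated mask takes fractional values.

```python
import numpy as np

def bilinear_sample(img, y, x):
    """Bilinearly sample a 2-D array at fractional coordinates (y, x)."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, img.shape[0] - 1), min(x0 + 1, img.shape[1] - 1)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * img[y0, x0] + (1 - wy) * wx * img[y0, x1]
            + wy * (1 - wx) * img[y1, x0] + wy * wx * img[y1, x1])

# Sparse ground-truth flow: three labeled pixels (true value 4.0),
# one unlabeled pixel stored as 0.
flow = np.array([[4.0, 0.0],
                 [4.0, 4.0]])
valid = np.array([[1.0, 0.0],
                  [1.0, 1.0]])

# Sampling halfway between pixels mixes the unlabeled zero into the result:
print(bilinear_sample(flow, 0.5, 0.5))   # 3.0, not the true 4.0
# ... and the interpolated mask is no longer binary:
print(bilinear_sample(valid, 0.5, 0.5))  # 0.75
```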

In your paper, you state that for sparse ground-truth flow in KITTI, the augmented flow is weighted-averaged based on the interpolated valid mask. However, I cannot find how this is handled in detail in the code. Could you please explain how to apply geometric transformations to sparse ground-truth datasets (e.g., how the interpolated valid mask is used)?

Thanks for your attention!

Hi, thanks for your interest in our work!

The main idea here is to apply the geometric transformations to the images, the flow, and the valid mask simultaneously. In both the flow and the mask, invalid regions are all zero, so after the transformations you can simply divide the transformed flow by the transformed mask to get the weighted-averaged ground truth over valid pixels. For detailed implementations, see augmentation.py, line 314, and reader/kitti.py, line 72.
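A minimal NumPy sketch of this idea (my own simplified version for a plain resize, not the repo's actual augmentation code; the function names are illustrative): interpolate `flow * valid` and `valid` with the same operator, then divide, so unlabeled zeros contribute nothing to either the numerator or the denominator.

```python
import numpy as np

def bilinear_resize(img, out_h, out_w):
    """Bilinearly resize a 2-D array (align-corners-style sampling grid)."""
    in_h, in_w = img.shape
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, in_h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]; wx = (xs - x0)[None, :]
    top = (1 - wx) * img[y0][:, x0] + wx * img[y0][:, x1]
    bot = (1 - wx) * img[y1][:, x0] + wx * img[y1][:, x1]
    return (1 - wy) * top + wy * bot

def resize_sparse_flow(flow, valid, out_h, out_w, eps=1e-6):
    """Resize sparse flow: interpolate (flow * valid) and valid with the
    same operator, then divide. Invalid pixels are zero in both the
    numerator and the denominator, so the result is a weighted average
    over valid pixels only."""
    num = bilinear_resize(flow * valid, out_h, out_w)
    den = bilinear_resize(valid, out_h, out_w)
    out_flow = num / np.maximum(den, eps)
    # Re-binarize: keep output pixels that received any valid support
    # (the exact threshold here is a design choice, not from the paper).
    out_valid = (den > 0).astype(flow.dtype)
    return out_flow, out_valid

# Sparse flow with true value 4.0 and one unlabeled (zero) pixel:
flow = np.array([[4.0, 0.0],
                 [4.0, 4.0]])
valid = np.array([[1.0, 0.0],
                  [1.0, 1.0]])
out_flow, out_valid = resize_sparse_flow(flow, valid, 3, 3)
print(out_flow[1, 1])  # 4.0 -- the zero no longer dilutes the average
```

Naively resizing `flow` alone would yield 3.0 at the center of this example; the division by the interpolated mask restores the correct 4.0 wherever any valid pixel contributes.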

I hope this solves your question! :)