wvangansbeke / Sparse-Depth-Completion

Predict dense depth maps from sparse and noisy LiDAR frames guided by RGB images. (Ranked 1st place on KITTI) [MVA 2019]

Home Page: https://arxiv.org/pdf/1902.05356.pdf


Why are the confidence map and guidance map able to correct mistakes?

jiangwei221 opened this issue · comments

Hi,
Looking forward to the release!

I have some questions about the paper.

  1. I don't understand why the confidence (uncertainty) map and guidance map are able to correct mistakes in the ground truth. In a general setting, guided or unguided, I would expect a CNN to handle a small amount of error in the ground truth on its own.

  2. How many channels does the guidance map have (one of the global net's outputs)? The figure says 1216x256x1; did you try increasing the number of channels, and how did that perform?

  3. As for the ERFNet pretrained on the Cityscapes dataset, what was the setting for the pretraining? And did you try depth completion without pretraining?

Thank you!

Hi,

  1. It all depends on the receptive field. The global network can detect global changes due to its large receptive field, whereas the local network (with a small receptive field) only needs to perform some sort of interpolation. It is important to know that the input LiDAR frame contains local mistakes which are hard to detect with a small network. The global network can inform the local network which LiDAR points seem inconsistent.

  2. You can use more than one channel if you want, but I don't think it matters that much.

  3. I downloaded a pretrained network that was accessible from the ERFNet GitHub page, but I don't know if it is still available. Pretraining worked slightly better in my case, though.

Hope this helps.
Kind regards,
Wouter.
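To make point 1 concrete, here is a minimal sketch of confidence-weighted late fusion: each branch predicts a depth map plus a per-pixel confidence map, and a pixel-wise softmax over the confidences decides how much each branch contributes. All names and values below are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def fuse_predictions(depth_global, conf_global, depth_local, conf_local):
    """Confidence-weighted late fusion of two depth predictions (a sketch).

    A pixel-wise softmax over the two confidence maps yields fusion
    weights, so a branch that distrusts a pixel (e.g. a spurious LiDAR
    point) contributes less to the final depth at that pixel.
    """
    # Stack confidences: shape (2, H, W), then softmax over the branch axis.
    c = np.stack([conf_global, conf_local])
    w = np.exp(c - c.max(axis=0, keepdims=True))   # subtract max for stability
    w /= w.sum(axis=0, keepdims=True)
    return w[0] * depth_global + w[1] * depth_local

# Toy example: the local branch copied a spurious LiDAR point (50 m),
# but assigns it low confidence, so the fused value leans toward the
# global estimate (10 m) at that pixel.
depth_global = np.full((2, 2), 10.0)            # global depth estimate (m)
depth_local = np.array([[10.0, 10.0],
                        [10.0, 50.0]])          # local estimate, one outlier
conf_global = np.zeros((2, 2))                  # neutral global confidence
conf_local = np.array([[ 3.0,  3.0],
                       [ 3.0, -3.0]])           # low confidence at outlier
fused = fuse_predictions(depth_global, conf_global, depth_local, conf_local)
```

At the outlier pixel the local weight is softmax(0, -3) ≈ 0.047, so the fused depth stays close to 10 m instead of jumping to 50 m.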

Thanks for the detailed explanation! It's very helpful!