Extra 2D Head

Question

alfinnurhalim opened this issue 2 years ago · comments

Hi Danila,
I want to try your extra 2D head to estimate the camera pose on my dataset (indoor, SUN RGB-D format). Could you please elaborate on how and where you implemented it?
How did you 'connect' this extra network to the main network?
Is it separate from the standard indoor_dataset head?

Thank you in advance.

Danila Rukhovich · Answer 1 · Wed Jan 19 2022 21:26:07 GMT+0800 (China Standard Time)

Hi @alfinnurhalim ,

We simply call head_2d right after the ResNet backbone. This head is implemented here and contains only 2 MLPs. To try it without 3D detection, you can get features_2d here, print them, and return. You can even remove neck, neck3d and bbox_head to make the model lighter.
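
To illustrate the wiring described in this reply, here is a minimal pure-Python sketch of a backbone followed by a 2D head made of two MLPs, with an early return that skips everything else. It is not the repository's code: `toy_backbone`, `Head2D`, `MLP`, `forward_2d_only`, the hidden width of 16, and the output sizes (2 camera angles, 7 layout values) are all assumptions chosen for the example.

```python
import random


def linear(x, weights, bias):
    """Compute W x + b for a feature vector stored as a list of floats."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def global_average_pool(feature_map):
    """Reduce a C x H x W nested list to a length-C vector."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


class MLP:
    """Hypothetical two-layer perceptron: in_dim -> hidden_dim -> out_dim."""

    def __init__(self, in_dim, hidden_dim, out_dim, rng):
        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.w1, self.b1 = init(hidden_dim, in_dim), [0.0] * hidden_dim
        self.w2, self.b2 = init(out_dim, hidden_dim), [0.0] * out_dim

    def __call__(self, x):
        return linear(relu(linear(x, self.w1, self.b1)), self.w2, self.b2)


class Head2D:
    """Assumed layout: one MLP for camera angles, one for the room layout."""

    def __init__(self, in_channels, rng):
        self.angle_mlp = MLP(in_channels, 16, 2, rng)   # assumed: pitch, roll
        self.layout_mlp = MLP(in_channels, 16, 7, rng)  # assumed: 7 layout values

    def __call__(self, feature_map):
        pooled = global_average_pool(feature_map)
        return self.angle_mlp(pooled), self.layout_mlp(pooled)


def toy_backbone(image, out_channels=8):
    """Stand-in for the ResNet backbone: scales the image once per channel."""
    return [[[v * (c + 1) for v in row] for row in image] for c in range(out_channels)]


def forward_2d_only(image, head_2d):
    """Run the backbone and the 2D head, then return before any 3D stage."""
    features = toy_backbone(image)
    features_2d = head_2d(features)
    return features_2d  # neck, neck3d and bbox_head are never reached


rng = random.Random(0)
head = Head2D(in_channels=8, rng=rng)
image = [[0.1 * (i + j) for j in range(4)] for i in range(4)]
angles, layout = forward_2d_only(image, head)
print(len(angles), len(layout))
```

In a real model the same idea amounts to returning the 2D head's output right after the backbone call, so the 3D modules can be dropped from the config without affecting the 2D prediction.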

Alfin Nurhalim · Answer 2 · Fri Jan 21 2022 06:46:38 GMT+0800 (China Standard Time)

Why does it not also predict the yaw of the camera?
One more thing: the ground truth for the layout loss is the size of the layout, which is the voxel size times the number of voxels. Is that correct? Thank you.

Danila Rukhovich · Answer 3 · Fri Jan 21 2022 14:46:03 GMT+0800 (China Standard Time)

Hi @alfinnurhalim ,

We follow the Total3dUnderstanding paper and its benchmark, which assume yaw=0. For more details you can follow their paper / code. We adapt their angles -> extrinsic matrix transformation here.
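
As a rough illustration of an angles -> extrinsic matrix transformation with yaw fixed at 0, the sketch below composes elementary rotations into a 4x4 matrix. The axis assignment (pitch about x, roll about z, yaw about y), the multiplication order, and the zero translation are assumptions made for the example; the convention actually used by Total3dUnderstanding should be taken from their code.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def angles_to_extrinsic(pitch, roll, yaw=0.0):
    """Build a 4x4 extrinsic matrix from camera angles in radians.

    Assumed convention: R = R_y(yaw) @ R_x(pitch) @ R_z(roll), no translation.
    With yaw left at 0, only pitch and roll affect the result.
    """
    rotation = matmul(rot_y(yaw), matmul(rot_x(pitch), rot_z(roll)))
    extrinsic = [row + [0.0] for row in rotation]
    extrinsic.append([0.0, 0.0, 0.0, 1.0])
    return extrinsic


for row in angles_to_extrinsic(pitch=math.radians(10.0), roll=math.radians(-3.0)):
    print([round(v, 4) for v in row])
```

Because yaw defaults to 0, a head that predicts only two angles is enough to fill in this matrix under the stated assumptions.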

No, the number of voxels times the voxel size is fixed for each dataset. The layout here is the actual size of the room, bounded by its walls, floor and ceiling. We also follow the Total3dUnderstanding code for ground truth layout estimation; you can find this info in *.json.
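
The distinction can be made concrete with a little arithmetic. In the sketch below, the voxel grid settings and the per-scene room sizes are illustrative assumptions, not values read from the repository or from any *.json file; the point is only that the voxel volume extent is one constant per dataset while the layout size changes from scene to scene.

```python
import math

# Assumed, illustrative dataset-level settings (one fixed pair per dataset).
n_voxels = (80, 80, 32)
voxel_size = (0.08, 0.08, 0.08)  # metres per voxel along each axis


def volume_extent(n_voxels, voxel_size):
    """Physical size of the voxel volume: number of voxels times voxel size."""
    return tuple(n * s for n, s in zip(n_voxels, voxel_size))


# Hypothetical ground-truth layout sizes in metres, one per scene.
room_sizes = {
    "scene_a": (4.2, 3.5, 2.7),
    "scene_b": (6.0, 5.1, 2.9),
}

extent = volume_extent(n_voxels, voxel_size)
print("voxel volume extent:", tuple(round(e, 2) for e in extent))
for name, size in room_sizes.items():
    # The layout target is the room size itself, independent of the grid.
    inside = all(r <= e for r, e in zip(size, extent))
    print(name, "layout size:", size, "fits inside voxel volume:", inside)
```

A room can be smaller or larger than the voxel volume along any axis, which is why the layout ground truth cannot be derived from the grid settings.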

Alfin Nurhalim · Answer 4 · Mon Jan 24 2022 07:23:14 GMT+0800 (China Standard Time)

Hi @filaPro,
Thanks for the clarification! I will look further into their paper for more info.
Thank you very much.