Why is only the first head used in pose computation?
shan18 opened this issue · comments
Hi PPGeo Team,
The output of the Pose Decoder contains two heads for the axisangle and translation, i.e. the shapes of the output are like [.., 2, ...]. But during the calculation of cam_T_cam, I see that only the first is ever used.
Lines 122 to 125 in bb37f52
Could you please clarify why the network predicts two heads when only one of them is used? Does the second head serve any particular purpose? According to the code, only the first head is ever used.
Yes, the second head is never used here. We set the num_frames_to_predict_for of the PoseDecoder to stay consistent with the original MonodepthV2 model structure. You can set num_frames_to_predict_for to 1 to keep only one head.
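To illustrate the point above, here is a minimal sketch of the shapes involved. It is not the PPGeo code: fake_pose_decoder and select_first_head are hypothetical stand-ins, NumPy replaces the real network, and the (batch, num_frames_to_predict_for, 1, 3) output layout is an assumption based on the MonodepthV2-style decoder mentioned in the reply.

```python
import numpy as np


def fake_pose_decoder(batch_size, num_frames_to_predict_for, seed=0):
    """Hypothetical stand-in for a MonodepthV2-style pose decoder.

    Assumed layout: the decoder emits 6 numbers per predicted frame,
    split into a 3-vector axis-angle and a 3-vector translation.
    """
    rng = np.random.default_rng(seed)
    out = 0.01 * rng.standard_normal(
        (batch_size, num_frames_to_predict_for, 1, 6)
    )
    axisangle = out[..., :3]    # (B, num_frames_to_predict_for, 1, 3)
    translation = out[..., 3:]  # (B, num_frames_to_predict_for, 1, 3)
    return axisangle, translation


def select_first_head(axisangle, translation):
    """Keep only head 0, as the pose computation discussed here does."""
    return axisangle[:, 0], translation[:, 0]


# Two heads predicted, only the first one consumed downstream.
axisangle, translation = fake_pose_decoder(4, num_frames_to_predict_for=2)
aa_first, t_first = select_first_head(axisangle, translation)

# One head predicted: the selected tensors have the same shape as before.
axisangle_1, translation_1 = fake_pose_decoder(4, num_frames_to_predict_for=1)
aa_only, t_only = select_first_head(axisangle_1, translation_1)
```

Under these assumptions, the tensors passed on to the pose computation have the same shape whether the decoder predicts one head or two, which is why reducing num_frames_to_predict_for to 1 should not require changes downstream.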
I see. Thanks a lot for the response.