How is the translation solved?

Question

panji530 opened this issue 6 years ago

Hi,

Thanks for sharing your implementation!
I was wondering if you know how to solve for the translation vector (from object center to camera center) after predicting the object dimensions and yaw angle. In your current visualization, the ground-truth translation vectors are used, but according to the paper these should also be solved for.

Best,
Pan

Xinshuo Weng · Answer 1 · Wed Feb 13 2019 02:38:06 GMT+0800 (China Standard Time)

same question here.

MengAjin · Answer 2 · Wed Apr 10 2019 21:40:07 GMT+0800 (China Standard Time)

same question here.

Linye Li · Answer 3 · Mon Oct 28 2019 18:04:15 GMT+0800 (China Standard Time)

Use projective property.

Yan Lu · Answer 4 · Tue Feb 11 2020 08:10:16 GMT+0800 (China Standard Time)

Hi, I found the answer in the supplementary material. The main logic is:

At the beginning of Section 3, the authors give four constraints relating the 2D bounding box to the 3D bounding box (Eq. 1 and Eq. 2). But there are 9 parameters here, so these four constraints alone are not enough to solve for them.
So the main idea is to reduce the degrees of freedom.
Sections 3.1 and 3.2 reduce the degrees of freedom to 7 (yaw angle, box size, and box center location).
A CNN is trained to estimate the yaw angle and the box size, which leaves only 3 degrees of freedom.
So at inference time, use the CNN to estimate the yaw angle and the box size, then use the four constraint equations to solve for the remaining 3 parameters (the location x, y, z, as described in the supplementary material).
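
To make the last step concrete, here is a minimal sketch of how the translation could be recovered once the yaw and size are known. It is not the code from this repository or from the paper. It assumes a pinhole intrinsic matrix `K`, a rotation about the camera Y axis only, a box origin at the geometric center, and dimensions ordered `(h, w, l)`; all function names are made up for illustration. Each side of the 2D box is assumed to be touched by one projected 3D corner, which gives one equation that is linear in the translation `T`. Since we do not know which corner touches which side, the sketch simply tries all 8^4 assignments and keeps the one whose reprojected box fits best (the paper is said to prune this set, which is not done here).

```python
import itertools
import numpy as np

def rot_y(yaw):
    """Rotation about the camera Y axis (assumed convention)."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def box_corners(dims):
    """8 corners of a box centred at the origin; dims = (h, w, l) is an assumption."""
    h, w, l = dims
    signs = np.array(list(itertools.product([-1.0, 1.0], repeat=3)))
    return signs * np.array([l, h, w]) / 2.0  # x: length, y: height, z: width

def project(K, pts_cam):
    """Pinhole projection of (N, 3) camera-frame points to (N, 2) pixels."""
    uvw = pts_cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

def tight_box(uv):
    """Axis-aligned box (xmin, ymin, xmax, ymax) around 2D points."""
    return np.array([uv[:, 0].min(), uv[:, 1].min(), uv[:, 0].max(), uv[:, 1].max()])

def solve_translation(K, yaw, dims, box2d):
    """Brute-force search over corner-to-side assignments; returns (T, error)."""
    box2d = np.asarray(box2d, dtype=float)
    P = box_corners(dims) @ rot_y(yaw).T  # rotated corners, shape (8, 3)

    # A corner p touching the side u = v satisfies (K[a] - v * K[2]) . (p + T) = 0,
    # with a = 0 for the x sides and a = 1 for the y sides. This is linear in T.
    axes = [0, 1, 0, 1]  # xmin, ymin, xmax, ymax
    A = np.stack([K[a] - v * K[2] for a, v in zip(axes, box2d)])  # (4, 3)
    B = -(A @ P.T)  # B[k, i]: right-hand side if corner i touches side k
    A_pinv = np.linalg.pinv(A)  # least-squares solve of 4 equations, 3 unknowns

    best_T, best_err = None, np.inf
    for idx in itertools.product(range(8), repeat=4):
        T = A_pinv @ B[np.arange(4), list(idx)]
        pts = P + T
        if np.any(pts[:, 2] <= 0):  # box must lie in front of the camera
            continue
        err = np.abs(tight_box(project(K, pts)) - box2d).sum()
        if err < best_err:
            best_T, best_err = T, err
    return best_T, best_err
```

A quick way to sanity-check it is to project a box with a known translation, take the tight 2D box around the projection, and confirm that `solve_translation` returns roughly the same translation. With real detections the 2D box is noisy, so the reprojection error will not reach zero and the estimate is only as good as the predicted yaw and size.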