smallcorgi / 3D-Deepbox

3D Bounding Box Estimation Using Deep Learning and Geometry (MultiBin)

How is the translation solved?

panji530 opened this issue

Hi,

Thanks for sharing your implementation!
I was wondering if you know how to solve for the translation vector (from object center to camera center) after predicting the object dimensions and yaw angle. Your current visualization uses the ground-truth translation vectors, but according to the paper these should also be solved for.

Best,
Pan

same question here.

same question here.

Use the projective property.

Hi, I found the answer in the supplementary material. The main logic is:

  1. At the beginning of Section 3, the authors give four constraints relating the 2D bounding box to the 3D bounding box (Eq. 1 and Eq. 2). But a 3D box has 9 parameters (three for translation, three for rotation, three for dimensions), so these four constraints alone cannot determine all of them.
  2. So the main idea is to reduce the degrees of freedom.
  3. Sections 3.1 and 3.2 reduce the degrees of freedom to 7 (yaw angle, box size, and box center location).
  4. A CNN is trained to estimate the yaw angle and the box size. After that, only 3 degrees of freedom remain.
  5. So at inference time, use the CNN to estimate the yaw angle and the box size, and then use the four constraint equations to solve for the remaining 3 parameters (the location x, y, z, as described in the supplementary material).
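For anyone who wants to try step 5, here is a minimal sketch in Python with NumPy. It is not the authors' code and not necessarily the exact procedure from the supplementary material. It assumes a pinhole intrinsic matrix `K`, a rotation about the camera y-axis only, dimensions ordered as `(h, w, l)`, and an object frame centered on the box (KITTI labels use the bottom-center instead, so an offset would be needed there). All function names are hypothetical. Each side of the 2D box must be touched by the projection of some 3D corner, which gives one linear equation in the translation `T` per side. Since the touching corners are unknown, the sketch simply tries all 8^4 assignments and keeps the one whose reprojected box matches best.

```python
import itertools
import numpy as np

def rotation_y(yaw):
    """Rotation about the camera y-axis (assumed to be the only rotation)."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def box_corners(dims):
    """8 corners of a box centered at the origin; dims = (h, w, l) is an assumed order."""
    h, w, l = dims
    signs = np.array(list(itertools.product([-1, 1], repeat=3)), dtype=float)
    return signs * np.array([l, h, w]) / 2.0

def project_box(K, pts_cam):
    """Tight 2D box (xmin, ymin, xmax, ymax) around projected camera-frame points."""
    uvw = pts_cam @ K.T
    uv = uvw[:, :2] / uvw[:, 2:3]
    return np.array([uv[:, 0].min(), uv[:, 1].min(), uv[:, 0].max(), uv[:, 1].max()])

def solve_translation(K, box2d, yaw, dims):
    """Brute-force sketch: least-squares T for every corner-to-side assignment."""
    box2d = np.asarray(box2d, dtype=float)
    xmin, ymin, xmax, ymax = box2d
    corners = box_corners(dims) @ rotation_y(yaw).T  # rotated corners, object at origin
    # A corner X projects onto the line u = xmin when (K[0] - xmin * K[2]) . (R X + T) = 0.
    # Stacking the four sides gives A T = b, where A depends only on K and the 2D box.
    A = np.stack([K[0] - xmin * K[2], K[1] - ymin * K[2],
                  K[0] - xmax * K[2], K[1] - ymax * K[2]])
    A_pinv = np.linalg.pinv(A)
    M = -(corners @ A.T)  # M[j, i] = right-hand side if corner j touches side i
    best_T, best_err = None, np.inf
    for combo in itertools.product(range(8), repeat=4):
        T = A_pinv @ M[list(combo), [0, 1, 2, 3]]
        pts = corners + T
        if np.any(pts[:, 2] <= 0):  # the box must lie in front of the camera
            continue
        err = np.abs(project_box(K, pts) - box2d).sum()
        if err < best_err:
            best_T, best_err = T, err
    return best_T, best_err
```

A quick sanity check is to pick a known translation, project the box to get a synthetic 2D box, and confirm that `solve_translation` recovers the translation. With real detections the 2D box is noisy, so the best residual will not be zero.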