yzhao520 / CPP

CVPR 2021 "Camera Pose Matters: Improving Depth Prediction by Mitigating Pose Distribution Bias"


Some issues with the calculation process in PDA

shuowang666 opened this issue

[screenshot: the T_rel computation in PDA]
An important step in PDA is the calculation of T_rel. My understanding is that the coordinates in the original camera system should first be transformed to the world coordinate system via the original pose, and then mapped into the converted camera system via the pose of the converted camera. But the implementation of `cam_coord_rgb` in your code seems to have a problem: `cam_coord` holds the 3D coordinates in the original camera coordinate system, so why is it multiplied by R2 first? R2 is the rotation matrix of the converted camera system; the two are not in the same coordinate system at all. What is the meaning of `cam_coord_rgb`?
[screenshot: the cam_coord_rgb implementation]
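For reference, the forward chain the question describes can be written out as follows (a sketch assuming T_i = (R_i, t_i) are the camera-to-world poses of the original and converted views; the repo may use a different pose convention):

```latex
% Forward chain: original camera frame -> world -> converted camera frame.
% T_i = (R_i, t_i) are assumed camera-to-world poses of the two views.
p_w = R_1\, p_{c_1} + t_1, \qquad
p_{c_2} = R_2^{\top}\,(p_w - t_2), \qquad
T_{\mathrm{rel}} = T_2^{-1}\, T_1 .
```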

Sorry for the confusion. cam_coord_rgb is computed in the opposite direction compared to cam_coord_depth. The purpose of this is to use the grid_sample function: essentially, cam_coord_rgb is the flow-field grid from the target image back to the source image, before normalization into [-1, 1]. Please let me know if it is still unclear.
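To illustrate what "flow-field grid before normalizing into [-1, 1]" means, here is a minimal sketch (not the repo's code; function and variable names are illustrative) of how an unnormalized pixel-coordinate grid is prepared for torch.nn.functional.grid_sample:

```python
import torch
import torch.nn.functional as F

def warp_with_grid(src_img, pix_coords):
    """Sample src_img (B, C, H, W) at pix_coords (B, H, W, 2), where
    pix_coords[..., 0] is x (column) and pix_coords[..., 1] is y (row)
    in unnormalized pixel units of the source image."""
    _, _, h, w = src_img.shape
    grid = pix_coords.clone()
    # grid_sample expects coordinates in [-1, 1], so normalize
    # x from [0, W-1] and y from [0, H-1] into that range.
    grid[..., 0] = 2.0 * grid[..., 0] / (w - 1) - 1.0
    grid[..., 1] = 2.0 * grid[..., 1] / (h - 1) - 1.0
    return F.grid_sample(src_img, grid, mode='bilinear',
                         padding_mode='zeros', align_corners=True)
```

Each output pixel reads from the source location stored in the grid, which is exactly the "target back to source" direction that grid_sample requires.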

Sorry, I still don't understand how this works. cam_coord is a 3D coordinate in the original camera system, so my understanding is that only the forward calculation is valid; the reverse formula looks incorrect.

cam_coord represents the 3D coordinates in the original camera system, but your formula seems to treat cam_coord as the 3D coordinates in the converted camera system. These are not equivalent, because the depths at the same pixel position (u, v) differ between the two camera systems.

Yes, pc is the 3D point cloud in the original camera coordinates. Normally, to get the flow-field grid from the target back to the source image, we would need the depth in the target view to warp back, which is not directly available. One potential solution is to compute a forward warp and then invert it to obtain the backward warp field, but the result may not be dense enough and it is computationally expensive. Alternatively, directly splatting RGB values in the forward direction works, but you may see some grid artifacts.
In this released implementation it is a little tricky: since we are dealing with small rotation changes and no translations, we find that directly using this point cloud and warping back works well enough. You are welcome to use the other approaches mentioned above.
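Here is a minimal sketch of this rotation-only trick (assumptions: K is the 3x3 intrinsics, R is the 3x3 rotation mapping the converted camera frame back to the original one, and the source-view depth is reused as a stand-in for the unavailable target-view depth; names are illustrative, not the repo's API):

```python
import torch

def backward_warp_grid(depth, K, R):
    """depth: (H, W) source-view depth; K: (3, 3) intrinsics;
    R: (3, 3) rotation from converted view back to original view.
    Returns (H, W, 2) unnormalized pixel coords into the source image."""
    h, w = depth.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=depth.dtype, device=depth.device),
        torch.arange(w, dtype=depth.dtype, device=depth.device),
        indexing='ij')
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).reshape(3, -1)
    # Back-project the pixel grid with the source depth: this point cloud
    # stands in for the (unavailable) target-view one.
    cam = torch.linalg.inv(K) @ pix * depth.reshape(1, -1)
    cam_rot = R @ cam                          # rotate into the source frame
    proj = K @ cam_rot                         # reproject with the intrinsics
    uv = proj[:2] / proj[2:].clamp(min=1e-6)   # perspective divide
    return uv.permute(1, 0).reshape(h, w, 2)
```

The resulting grid can then be normalized to [-1, 1] and fed to grid_sample as in the earlier sketch; the approximation holds because, with small rotations and no translation, the source and target depths at the same pixel stay close.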

Thanks a lot.