IntelRealSense / librealsense

Intel® RealSense™ SDK

Home Page: https://www.intelrealsense.com/


Camera calibration and converting between camera coordinates and real-world coordinates with a RealSense camera

LeVHoangduc opened this issue · comments


Required Info
Camera Model                   L515
Firmware Version               (Open RealSense Viewer --> Click info)
Operating System & Version     Linux (Ubuntu 20.04)
Kernel Version (Linux Only)    5.15
Platform                       PC
SDK Version
Language                       python
Segment                        Robot

Issue Description

Currently, I'm trying to understand the transformation from the camera coordinate system to the real-world coordinate system. This is my problem in detail; I hope to get more information from you.

  1. Theory:
     I follow this pinhole model:
     [image: pinhole camera model diagram]

According to the equation below:
[image: pinhole projection equation]

For the coordinate transformation I use the equation below (also written out for reference after the figures):
[image: coordinate transformation equation]
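
For reference (in case the images above do not render), the relations I am assuming are the standard pinhole projection and its inverse, where s is the projective scale and equals the point's depth Z_c in the camera frame:

s * [u, v, 1]^T = K * (R * [X_w, Y_w, Z_w]^T + t),    K = [[f_x, 0, c_x], [0, f_y, c_y], [0, 0, 1]]

[X_w, Y_w, Z_w]^T = R^{-1} * (s * K^{-1} * [u, v, 1]^T - t)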

Hence, this is what I do in detail:

import pyrealsense2 as rs

# Start streaming with the default configuration
pipeline = rs.pipeline()
pipe_profile = pipeline.start()

# Grab one set of frames
frames = pipeline.wait_for_frames()
depth_frame = frames.get_depth_frame()
color_frame = frames.get_color_frame()

# Intrinsics of the depth and color streams
depth_intrin = depth_frame.profile.as_video_stream_profile().intrinsics
print("depth intrinsics:\n", depth_intrin)
color_intrin = color_frame.profile.as_video_stream_profile().intrinsics
print("color intrinsics:\n", color_intrin)

# Extrinsics from the depth stream to the color stream
depth_to_color_extrin = depth_frame.profile.get_extrinsics_to(color_frame.profile)
print("Extrin Camera:\n", depth_to_color_extrin)

I obtain the camera's extrinsic and intrinsic matrices with the code above.

Then I try to convert to real-world coordinates as above,
with tvec_targer2cam and r_target2cam taken from the camera extrinsics,
and cameraMatrix_realsense from the intrinsics:

import numpy as np

# Intrinsic matrix (values taken from the intrinsics printed above)
cameraMatrix_realsense = np.array([[901.353, 0.0, 654.587],
                                   [0.0, 901.834, 351.407],
                                   [0.0, 0.0, 1.0]])

# r_target2cam and tvec_targer2cam come from my extrinsic calibration (obtained earlier)
tvec_targer2cam = tvec_targer2cam.reshape(3, 1)
r_target2cam_inv = np.linalg.inv(r_target2cam)

# Example point (pixel coordinates) in the camera frame
u = 770
v = 440
scale = 1

suv = scale * np.array([u, v, 1]).reshape(3, 1)

# ([Xi, Yi]) / scale = M_in x [Xc, Yc, Zc]
point_cam_coords = np.dot(np.linalg.inv(cameraMatrix_realsense), suv)

# Pinhole model: [Xc, Yc, Zc] = R [Xw, Yw, Zw] + t
point_real_coords = point_cam_coords - tvec_targer2cam
point_real_coords = np.dot(r_target2cam_inv, point_real_coords)
print("point_real_coords:\n", point_real_coords)

So I test with the u, v values I obtain by plotting the image with plt and moving the mouse to read the pixel coordinates.
The output of point_real_coords is:

x = 0.12844626
y = 0.10895989
z = 1.00277352

Given the information provided above, what is the unit of the x, y, z output?

Hi @LeVHoangduc - The L515 uses the same coordinate system as the D435i camera model, which is described at the link below.

https://github.com/IntelRealSense/librealsense/blob/master/doc/d435i.md

In this system, X is horizontal, Y is vertical and Z is depth (pointing forwards from the front of the camera). The distance unit of the real-world values is meters.
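
As a quick illustration of that convention, a small sketch reusing depth_frame, depth_intrin, and the pixel coordinates from the code above; all three values come back in meters:

# [X, Y, Z]: X = horizontal, Y = vertical, Z = depth pointing forwards, all in meters
x, y, z = rs.rs2_deproject_pixel_to_point(depth_intrin, [u, v], depth_frame.get_distance(u, v))
print("X (m):", x, " Y (m):", y, " Z (m):", z)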

Hi @LeVHoangduc Do you require further assistance with this case, please? Thanks!

Case closed due to no further comments received.