AprilRobotics / apriltag

AprilTag is a visual fiducial system popular for robotics research.

Home page: https://april.eecs.umich.edu/software/apriltag


Too much fluctuation while estimating camera pose using AprilTag and the inverse of the 3 x 3 rotation matrix

maharshi114 opened this issue · comments

First, let me explain what I am doing.

I am using the AprilTag C++ library's Python wrapper to estimate the pose of the tag. The outputs are a 3 x 3 rotation matrix and a 3 x 1 translation vector.

Now from this, I am taking the inverse of the 3 x 3 rotation matrix and calculating the new camera pose with the following equations.

Cam_X = (R[0][0]*Tag_x) + (R[0][1]*Tag_y) + (R[0][2]*Tag_z)
Cam_Y = (R[1][0]*Tag_x) + (R[1][1]*Tag_y) + (R[1][2]*Tag_z)
Cam_Z = (R[2][0]*Tag_x) + (R[2][1]*Tag_y) + (R[2][2]*Tag_z)
The camera is well calibrated, and the measured tag coordinates are accurate to within 5% error. But the results for Cam_X, Cam_Y, and Cam_Z fluctuate heavily, even though the tag is mounted on the ceiling, so my AprilTag is super stable (0 DOF, rigid body).

Can someone help me solve this basic problem? Thank you, I appreciate your efforts.

Reference for the above equation: https://github.com/ColumnRobotics/column/blob/master/src/rectified_april_tag.cpp
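If the goal is the camera's position in the tag's frame, the full inversion of the pose is x_cam = -R^T t (for a rotation matrix, the inverse is just the transpose). A minimal NumPy sketch of that product, assuming pose_R and pose_t are the 3 x 3 rotation matrix and 3 x 1 translation returned by the detector (the function name is mine):

```python
import numpy as np

def camera_position_in_tag_frame(pose_R, pose_t):
    """Invert the tag pose: the camera's position expressed in the
    tag's coordinate frame is -R^T @ t, since the inverse of a
    rotation matrix equals its transpose."""
    pose_R = np.asarray(pose_R, dtype=float)
    pose_t = np.asarray(pose_t, dtype=float).reshape(3, 1)
    return (-pose_R.T @ pose_t).ravel()
```

Doing the inversion this way avoids hand-writing the nine multiply-adds (and makes sign mistakes much harder).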

Also, I have analyzed the behavior of Yaw, Pitch, and Roll.

Tag_Yaw is stable, fluctuating by less than 0.1 degrees, but pitch and roll fluctuate by around 4 to 5 degrees.

So how can I improve the stability of the AprilTag parameters? With higher resolution, a wider FOV, or a higher FPS?

Thank you, I appreciate your efforts.

The code you linked looks like it is using the C++ AprilTags Python library written by Michael Kaess. That is a different AprilTag library than the one in this repo, and I will not offer support for issues with someone else's library.

Thank you for your response.
The link I attached is only for reference; I am using your library and its Python wrapper.

What method are you using to calculate the tag pose?

I am doing some simple mathematical calculations here to find yaw, pitch, roll, and X, Y, Z.

pose_r is the 3 x 3 rotation matrix (array), and pose_t is the 3 x 1 translation vector (array).

from math import atan2, degrees, sqrt

pose_r = detected_parameters.pose_R
pose_t = detected_parameters.pose_t
r11 = pose_r[0][0]
r12 = pose_r[0][1]
r13 = pose_r[0][2]
r21 = pose_r[1][0]
r22 = pose_r[1][1]
r23 = pose_r[1][2]
r31 = pose_r[2][0]
r32 = pose_r[2][1]
r33 = pose_r[2][2]

AprilTagPitch = round(degrees(atan2(-r31, sqrt((r32*r32) + (r33*r33)))), 3)
AprilTagRoll = round(degrees(atan2(-r32, r33)), 3)
AprilTagYaw = round(degrees(atan2(r21, r11)), 3)
AprilTagX = pose_t[0][0]
AprilTagY = pose_t[1][0]
AprilTagZ = pose_t[2][0]

Mathematical calculation reference: http://planning.cs.uiuc.edu/node103.html
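One note on the trig: plain atan only covers (-90°, 90°), so atan(y/x) loses the quadrant whenever the denominator is negative; the formulas on the linked page are conventionally written with atan2. A sketch of the ZYX extraction using atan2, following the sign convention on the linked page (the function name is mine):

```python
import numpy as np

def euler_zyx_from_matrix(R):
    """Extract yaw/pitch/roll in degrees (ZYX convention, as on the
    linked page) from a 3x3 rotation matrix, using atan2 so each
    angle lands in the correct quadrant."""
    R = np.asarray(R, dtype=float)
    yaw = np.degrees(np.arctan2(R[1, 0], R[0, 0]))
    pitch = np.degrees(np.arctan2(-R[2, 0], np.hypot(R[2, 1], R[2, 2])))
    roll = np.degrees(np.arctan2(R[2, 1], R[2, 2]))
    return yaw, pitch, roll
```

This will not by itself remove the pitch/roll noise, but it rules out quadrant flips as a source of apparent jumps.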

What I meant was, where do you get detected_parameters.pose_R and detected_parameters.pose_t from? After all, any uncertainty in your final result will stem solely from your uncertainty in these values.

I should also point out that by computing the inverse here, you are probably making the noise a lot worse, because your camera position and rotation is now dependent on the estimate of the apriltag's rotational pose instead of just its position. Is there some other way you can formulate your problem? If the goal is to localize your camera in the scene, then having multiple apriltags, widely spaced apart, could help reduce the noise.

Using this Python wrapper, I can get detected_parameters.pose_R and detected_parameters.pose_t.

Link: https://github.com/duckietown/apriltags3-py

But actually, we tested the output of X, Y, Z, and yaw, and it is accurate within a certain limit against the original ground truth. From this testing, we are assuming that our camera calibration process and other settings are correct.

Also, we are limited to setting up only one AprilTag to localize our robotic platform.

Ah ok, I think that should be pretty close to the same code as is in this repo.

Here are a few things you could try, other than adding more tags to the scene:

  • Increase the size of the tag.
  • Increase camera resolution.
  • Make sure the tag is in focus; blurry edges are bad.
  • Add a second camera.
  • Try using OpenCV's PnP solver with SOLVEPNP_IPPE_SQUARE to solve for the pose instead of the built-in solver; it may be slightly more accurate.

I have read about solvePnP and solvePnPRansac in OpenCV. The thing is, I can't understand the following arguments:

rvecs, tvecs, inliers = cv2.solvePnPRansac(objp, corners2, mtx, dist)

I just want to know the values of objp and corners2, or where I can find them from the wrapper.

mtx is the camera calibration matrix, containing Fx, Fy, Cx, and Cy.
dist is the camera distortion coefficients.

You want to use solvePnP, not solvePnPRansac, since you know there won't be any outlier points.

Here's a link to the documentation that explains the arguments (for C++, but the Python args are similar).

The first arg is the 3d location of the tag corners when the tag is at the origin: [[-tagsize/ 2, tagsize/ 2, 0], [ tagsize/ 2, tagsize/ 2, 0]. [ tagsize/ 2, -tagsize/ 2, 0], [-tagsize/ 2, -tagsize/ 2, 0]]. The second arg is the 2d location of these same corners in your image, the AprilTag detector tells you these values.

[[-tagsize/ 2, tagsize/ 2, 0], [ tagsize/ 2, tagsize/ 2, 0]. [ tagsize/ 2, -tagsize/ 2, 0], [-tagsize/ 2, -tagsize/ 2, 0]]. The second arg is the 2d location of these same corners in your image, the AprilTag detector tells you these values.

There is a typo here: after the 2nd corner location there is a period where you meant to include a comma. It should be:
[[-tagsize/ 2, tagsize/ 2, 0], [ tagsize/ 2, tagsize/ 2, 0], [ tagsize/ 2, -tagsize/ 2, 0], [-tagsize/ 2, -tagsize/ 2, 0]]
Just wanted to share for the copy/pasters out there like me who will inevitably break their code trying to implement this.
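For the same copy/paste safety, the corner list can be built once in a small helper so the signs and commas can't be mistyped (tag_object_points is a hypothetical name; tag_size is the printed tag's edge length):

```python
import numpy as np

def tag_object_points(tag_size):
    """3-D tag-corner coordinates with the tag centred at the origin,
    z = 0, in the same corner order as the list above."""
    s = tag_size / 2.0
    return np.array([[-s,  s, 0.0],
                     [ s,  s, 0.0],
                     [ s, -s, 0.0],
                     [-s, -s, 0.0]])
```

By construction the corners are symmetric about the origin, which is easy to sanity-check before passing them to the solver.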

I have read about solvePnP and solvePnPRansac in OpenCV. The thing is, I can't understand the following arguments:

rvecs, tvecs, inliers = cv2.solvePnPRansac(objp, corners2, mtx, dist)

I just want to know the values of objp and corners2, or where I can find them from the wrapper.

mtx is the camera calibration matrix, containing Fx, Fy, Cx, and Cy.
dist is the camera distortion coefficients.

here's my implementation, if it helps:

import cv2
import numpy as np

corner = tag_size / 2
objectPoints = np.array([[-corner, corner, 0], [corner, corner, 0], [corner, -corner, 0], [-corner, -corner, 0]])
SOLVEPNP_IPPE_SQUARE = 7  # enumeration not exposed in this OpenCV version:
# https://docs.opencv.org/master/d9/d0c/group__calib3d.html#ga357634492a94efe8858d0ce1509da869

for d in detections:
    imagePoints = np.array([d['lb-rb-rt-lt']])

    # solvePnP docs: https://docs.opencv.org/master/d9/d0c/group__calib3d.html#ga549c2075fac14829ff4a58bc931c033d
    _ret, rvec, tvec = cv2.solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, useExtrinsicGuess=False, flags=SOLVEPNP_IPPE_SQUARE)
    print("tvec:", tvec)
    R, _jacobian = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
    yaw = np.arctan2(R[0, 2], R[2, 2]) * 180 / np.pi
    roll = np.arcsin(-R[1, 2]) * 180 / np.pi
    pitch = np.arctan2(R[1, 0], R[1, 1]) * 180 / np.pi

Yaw, pitch, and roll are not absolute; they depend on the order in which the rotations occur. For a more thorough explanation of the possible orders and the resulting math, see here.
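The order dependence is easy to demonstrate numerically: composing the same two elementary rotations in opposite orders yields different rotation matrices, so the yaw/pitch/roll extracted from them differ too. A minimal sketch:

```python
import numpy as np

def rot_x(a):
    """Elementary rotation about the x axis by angle a (radians)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_z(a):
    """Elementary rotation about the z axis by angle a (radians)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

a, b = np.radians(30.0), np.radians(45.0)
# Same two rotations, opposite composition orders -> different orientations.
R_zx = rot_z(a) @ rot_x(b)
R_xz = rot_x(b) @ rot_z(a)
```

Here np.allclose(R_zx, R_xz) is False, which is why the extraction formulas must match the convention in which the matrix was built.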


Thank you. I have implemented solvePnP at 1920 x 1080 resolution. It still fluctuates a bit, but overall X, Y, Z are stable.

I will post my code soon.