AprilRobotics / apriltag

AprilTag is a visual fiducial system popular for robotics research.

Home page: https://april.eecs.umich.edu/software/apriltag


Too much fluctuation while estimating camera pose using AprilTag and the inverse of the 3 x 3 rotation matrix

maharshi114 opened this issue · comments

First, let me explain what I am doing.

I am using the AprilTag C++ library's Python wrapper to estimate the pose of the tag. The outputs are a 3 x 3 rotation matrix and a 3 x 1 translation vector.

Now from this, I am taking the inverse of the 3 x 3 rotation matrix and calculating the new camera pose with the following equations.

Cam_X = (R[0][0]*Tag_x) + (R[0][1]*Tag_y) + (R[0][2]*Tag_z)
Cam_Y = (R[1][0]*Tag_x) + (R[1][1]*Tag_y) + (R[1][2]*Tag_z)
Cam_Z = (R[2][0]*Tag_x) + (R[2][1]*Tag_y) + (R[2][2]*Tag_z)
The camera is well calibrated, and the measured tag coordinates are accurate to within 5% error. But the results for Cam_X, Cam_Y, and Cam_Z fluctuate heavily, even though the tag is mounted on the ceiling, so my AprilTag is super stable (0 DOF, rigid body).

Can someone help me solve this basic problem? Thank you, I appreciate your efforts.

Reference for the above equation: https://github.com/ColumnRobotics/column/blob/master/src/rectified_april_tag.cpp
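If the goal is the camera's position in the tag's frame, the full inversion of the pose is x_cam = -R^T t (for a rotation matrix, the inverse is just the transpose). A minimal NumPy sketch of that product, assuming pose_R and pose_t are the 3 x 3 rotation matrix and 3 x 1 translation returned by the detector (the function name is mine):

```python
import numpy as np

def camera_position_in_tag_frame(pose_R, pose_t):
    """Invert the tag pose: the camera's position expressed in the
    tag's coordinate frame is -R^T @ t, since the inverse of a
    rotation matrix equals its transpose."""
    pose_R = np.asarray(pose_R, dtype=float)
    pose_t = np.asarray(pose_t, dtype=float).reshape(3, 1)
    return (-pose_R.T @ pose_t).ravel()
```

Doing the inversion this way avoids hand-writing the nine multiply-adds (and makes sign mistakes much harder).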

Also, I have analyzed the behavior of Yaw, Pitch, and Roll.

Tag_Yaw is stable, fluctuating by less than 0.1 degrees, but pitch and roll fluctuate by around 4 to 5 degrees.

So how can I improve the stability of the AprilTag parameters? With higher resolution, a wider FOV, or a higher FPS?

Thank you, I appreciate your efforts.

The code you linked looks like it is using the C++ AprilTags Python library written by Michael Kaess. That is a different AprilTag library than the one in this repo, and I will not offer support for issues with someone else's library.

Thank you for your response.
The link I attached is only for reference; I am using your library and its Python wrapper.

What method are you using to calculate the tag pose?

I am doing some simple mathematical calculations here to find yaw, pitch, roll, and X, Y, Z.

pose_r is the 3 x 3 rotation matrix (array), and pose_t is the 3 x 1 translation vector (array).

from math import atan2, degrees, sqrt

pose_r = detected_parameters.pose_R
pose_t = detected_parameters.pose_t
r11 = pose_r[0][0]
r12 = pose_r[0][1]
r13 = pose_r[0][2]
r21 = pose_r[1][0]
r22 = pose_r[1][1]
r23 = pose_r[1][2]
r31 = pose_r[2][0]
r32 = pose_r[2][1]
r33 = pose_r[2][2]

AprilTagPitch = round(degrees(atan2(-r31, sqrt((r32*r32) + (r33*r33)))), 3)
AprilTagRoll = round(degrees(atan2(-r32, r33)), 3)
AprilTagYaw = round(degrees(atan2(r21, r11)), 3)
AprilTagX = pose_t[0][0]
AprilTagY = pose_t[1][0]
AprilTagZ = pose_t[2][0]

Mathematical calculation reference: http://planning.cs.uiuc.edu/node103.html
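One note on the trig: plain atan only covers (-90°, 90°), so atan(y/x) loses the quadrant whenever the denominator is negative; the formulas on the linked page are conventionally written with atan2. A sketch of the ZYX extraction using atan2, following the sign convention on the linked page (the function name is mine):

```python
import numpy as np

def euler_zyx_from_matrix(R):
    """Extract yaw/pitch/roll in degrees (ZYX convention, as on the
    linked page) from a 3x3 rotation matrix, using atan2 so each
    angle lands in the correct quadrant."""
    R = np.asarray(R, dtype=float)
    yaw = np.degrees(np.arctan2(R[1, 0], R[0, 0]))
    pitch = np.degrees(np.arctan2(-R[2, 0], np.hypot(R[2, 1], R[2, 2])))
    roll = np.degrees(np.arctan2(R[2, 1], R[2, 2]))
    return yaw, pitch, roll
```

This will not by itself remove the pitch/roll noise, but it rules out quadrant flips as a source of apparent jumps.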

What I meant was, where do you get detected_parameters.pose_R and detected_parameters.pose_t from? After all, any uncertainty in your final result will stem solely from your uncertainty in these values.

I should also point out that by computing the inverse here, you are probably making the noise a lot worse, because your camera position and rotation is now dependent on the estimate of the apriltag's rotational pose instead of just its position. Is there some other way you can formulate your problem? If the goal is to localize your camera in the scene, then having multiple apriltags, widely spaced apart, could help reduce the noise.

Using this Python wrapper, I can get detected_parameters.pose_R and detected_parameters.pose_t.

Link: https://github.com/duckietown/apriltags3-py

But actually, we tested the output of X, Y, Z, and yaw, and it is accurate within a certain limit against the original ground truth. From this testing, we are assuming that our camera calibration process and other settings are correct.

Also, we are limited to setting up only one AprilTag to localize our robotic platform.

Ah ok, I think that should be pretty close to the same code as is in this repo.

Here are a few things you could try, other than adding more tags to the scene:

  • Increase the size of the tag.
  • Increase camera resolution.
  • Make sure the tag is in focus; blurry edges are bad.
  • Add a second camera.
  • Try using OpenCV's PnP solver with SOLVEPNP_IPPE_SQUARE to solve for the pose instead of the built-in solver; it may be slightly more accurate.

I have read about solvePnP and solvePnPRansac in OpenCV. The thing is, I can't understand the following arguments:

rvecs, tvecs, inliers = cv2.solvePnPRansac(objp, corners2, mtx, dist)

I just want to know the values of objp and corners2, or where I can find them from the wrapper.

mtx is the camera calibration matrix, containing Fx, Fy, Cx, and Cy.
dist is the camera distortion coefficients.

You want to use solvePnP, not solvePnPRansac, since you know there won't be any outlier points.

Here's a link to the documentation that explains the arguments (for C++, but the Python args are similar).

The first arg is the 3d location of the tag corners when the tag is at the origin: [[-tagsize/ 2, tagsize/ 2, 0], [ tagsize/ 2, tagsize/ 2, 0]. [ tagsize/ 2, -tagsize/ 2, 0], [-tagsize/ 2, -tagsize/ 2, 0]]. The second arg is the 2d location of these same corners in your image, the AprilTag detector tells you these values.

[[-tagsize/ 2, tagsize/ 2, 0], [ tagsize/ 2, tagsize/ 2, 0]. [ tagsize/ 2, -tagsize/ 2, 0], [-tagsize/ 2, -tagsize/ 2, 0]]. The second arg is the 2d location of these same corners in your image, the AprilTag detector tells you these values.

There is a typo here: after the 2nd corner location there is a period where you meant to include a comma. It should be:
[[-tagsize/ 2, tagsize/ 2, 0], [ tagsize/ 2, tagsize/ 2, 0], [ tagsize/ 2, -tagsize/ 2, 0], [-tagsize/ 2, -tagsize/ 2, 0]]
Just wanted to share for the copy/pasters out there like me who will inevitably break their code trying to implement this.
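For the same copy/paste safety, the corner list can be built once in a small helper so the signs and commas can't be mistyped (tag_object_points is a hypothetical name; tag_size is the printed tag's edge length):

```python
import numpy as np

def tag_object_points(tag_size):
    """3-D tag-corner coordinates with the tag centred at the origin,
    z = 0, in the same corner order as the list above."""
    s = tag_size / 2.0
    return np.array([[-s,  s, 0.0],
                     [ s,  s, 0.0],
                     [ s, -s, 0.0],
                     [-s, -s, 0.0]])
```

By construction the corners are symmetric about the origin, which is easy to sanity-check before passing them to the solver.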

I have read about solvePnP and solvePnPRansac in OpenCV. The thing is, I can't understand the following arguments:

rvecs, tvecs, inliers = cv2.solvePnPRansac(objp, corners2, mtx, dist)

I just want to know the values of objp and corners2, or where I can find them from the wrapper.

mtx is the camera calibration matrix, containing Fx, Fy, Cx, and Cy.
dist is the camera distortion coefficients.

here's my implementation, if it helps:

import cv2
import numpy as np

corner = tag_size / 2
objectPoints = np.array([[-corner, corner, 0], [corner, corner, 0], [corner, -corner, 0], [-corner, -corner, 0]])
SOLVEPNP_IPPE_SQUARE = 7  # enumeration not exposed in this OpenCV version:
# https://docs.opencv.org/master/d9/d0c/group__calib3d.html#ga357634492a94efe8858d0ce1509da869

for d in detections:
    imagePoints = np.array([d['lb-rb-rt-lt']])

    # solvePnP docs: https://docs.opencv.org/master/d9/d0c/group__calib3d.html#ga549c2075fac14829ff4a58bc931c033d
    _ret, rvec, tvec = cv2.solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, useExtrinsicGuess=False, flags=SOLVEPNP_IPPE_SQUARE)
    print("tvec:", tvec)
    R, _jacobian = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
    yaw = np.arctan2(R[0, 2], R[2, 2]) * 180 / np.pi
    roll = np.arcsin(-R[1, 2]) * 180 / np.pi
    pitch = np.arctan2(R[1, 0], R[1, 1]) * 180 / np.pi

Yaw, pitch, and roll are not absolute; they depend on the order in which the rotations occur. For a more thorough explanation of the possible orders and the resulting math, see here.
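The order dependence is easy to demonstrate numerically: composing the same two elementary rotations in opposite orders yields different rotation matrices, so the yaw/pitch/roll extracted from them differ too. A minimal sketch:

```python
import numpy as np

def rot_x(a):
    """Elementary rotation about the x axis by angle a (radians)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_z(a):
    """Elementary rotation about the z axis by angle a (radians)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

a, b = np.radians(30.0), np.radians(45.0)
# Same two rotations, opposite composition orders -> different orientations.
R_zx = rot_z(a) @ rot_x(b)
R_xz = rot_x(b) @ rot_z(a)
```

Here np.allclose(R_zx, R_xz) is False, which is why the extraction formulas must match the convention in which the matrix was built.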


Thank you. I have implemented solvePnP at 1920 x 1080 resolution. It still fluctuates a bit, but overall X, Y, Z are stable.

I will post my code soon.