
Fixing X and Y Coordinate Offsets in AprilTag Detection with Intel RealSense #12952

Closed
iamstrang3 opened this issue May 21, 2024 · 14 comments

@iamstrang3


| Required Info | |
| --- | --- |
| Camera Model | Intel RealSense D435 |
| Firmware Version | 5.16.0.1 |
| Operating System & Version | Win 11 |
| Platform | PC |
| SDK Version | v2.55.1 |
| Language | Python |
| Segment | WidowX 250s Arm |

Issue Description

I am using an Intel RealSense D435 camera with Python to detect AprilTags and calculate their 3D coordinates. I am seeing consistent offsets in the X and Y coordinates of the detected tags, which leads to inaccuracies in the computed 3D positions. I am aligning the depth and color streams. This issue affects the precision of my robotic arm's movements, which rely on these coordinates. I have reviewed the SDK examples and documentation, but I have not found a clear solution to this problem. Thank you!

Below is the code I am using:

```python
import cv2
import numpy as np
import pyrealsense2 as rs
from pupil_apriltags import Detector
import time

# Configure depth and color streams
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)

# Start the camera
pipeline.start(config)

# Initialize the AprilTag detector
detector = Detector(families='tag36h11', nthreads=4, quad_decimate=1.0,
                    quad_sigma=0.0, refine_edges=1, decode_sharpening=0.25, debug=0)

# Create an align object to map the depth stream onto the color stream
align = rs.align(rs.stream.color)

# Get the intrinsic parameters of the aligned color camera
profile = pipeline.get_active_profile()
color_profile = rs.video_stream_profile(profile.get_stream(rs.stream.color))
intrinsics = color_profile.get_intrinsics()

# Define init_pos correctly as a 1D array
init_pos = np.array([2.88055450e-01, 1.34160019e-18, 8.83400061e-02])

print("Waiting for 5 seconds to stabilize...")
time.sleep(5)
print("Starting frame processing...")

try:
    while True:
        frames = pipeline.wait_for_frames()
        aligned_frames = align.process(frames)
        depth_frame = aligned_frames.get_depth_frame()
        color_frame = aligned_frames.get_color_frame()

        if not depth_frame or not color_frame:
            continue

        color_image = np.asanyarray(color_frame.get_data())
        gray_image = cv2.cvtColor(color_image, cv2.COLOR_BGR2GRAY)
        tags = detector.detect(gray_image)

        for tag in tags:
            cv2.polylines(
                color_image, [tag.corners.astype(int)], True, (0, 255, 0), 2)
            cv2.circle(color_image, tuple(int(i)
                       for i in tag.center), 5, (0, 0, 255), -1)

            camera_z = depth_frame.get_distance(
                int(tag.center[0]), int(tag.center[1]))

            if 0.15 < camera_z < 1:
                # Map camera-frame coordinates onto the arm's axes
                arm_x = camera_z
                arm_y = -((camera_z / intrinsics.fx) *
                          (tag.center[0] - intrinsics.ppx))
                arm_z = -((camera_z / intrinsics.fy) *
                          (tag.center[1] - intrinsics.ppy))
                camera_position = np.array([arm_x, arm_y, arm_z])

                # Apply fixed offsets relative to the arm's initial position
                goal_x = init_pos[0] + arm_x - 0.025
                goal_y = init_pos[1] + arm_y + 0.04
                goal_z = init_pos[2] + arm_z - 0.005

                print(f'Pixel: {int(tag.center[0]), int(tag.center[1])}')
                print(f'Position: {goal_x, goal_y, goal_z}')

                with open("C:/Abhishek/Thesis/coordinates.txt", 'w') as file:
                    file.write(f"{goal_x}\n")
                    file.write(f"{goal_y}\n")
                    file.write(f"{goal_z}\n")

                cv2.putText(color_image, f"3D coordinates: X={goal_x:.4f}, Y={goal_y:.4f}, Z={goal_z:.4f}",
                            (25, 25), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)
                cv2.putText(color_image, f'Pixel: {int(tag.center[0]), int(tag.center[1]), camera_z}',
                            (25, 50), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)

        cv2.imshow('AprilTag Detection with Depth', color_image)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

finally:
    pipeline.stop()
    cv2.destroyAllWindows()
```
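One thing worth checking in the deprojection step above: the manual math `(camera_z / fx) * (u - ppx)` is a pinhole-only model that ignores the color stream's distortion coefficients, which can show up as a position-dependent offset. The SDK's `rs.rs2_deproject_pixel_to_point(intrinsics, [u, v], depth)` applies the full distortion model from the same `intrinsics` object. As a sketch for comparison, the pinhole-only version currently in the code is equivalent to:

```python
import numpy as np

def deproject_pinhole(u, v, depth, fx, fy, ppx, ppy):
    """Pinhole-only deprojection, mirroring the manual math in the script.

    Note: this ignores lens distortion; rs.rs2_deproject_pixel_to_point
    additionally applies the distortion model stored in the intrinsics.
    """
    x = depth * (u - ppx) / fx
    y = depth * (v - ppy) / fy
    return np.array([x, y, depth])

# A pixel at the principal point deprojects straight ahead:
# deproject_pinhole(320, 240, 1.0, 600.0, 600.0, 320.0, 240.0) -> [0.0, 0.0, 1.0]
```

With a live camera, swapping the manual math for `rs.rs2_deproject_pixel_to_point(intrinsics, [tag.center[0], tag.center[1]], camera_z)` is a quick way to rule distortion out as the source of the offset.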

@MartyG-RealSense (Collaborator) commented May 22, 2024

Hi @iamstrang3 If you have access to the RealSense Viewer tool then you could first try resetting the camera to its factory-new default calibration settings using the instructions at #10182 (comment) in order to eliminate the possibility of a mis-calibration in your camera's sensors.

Mis-calibration can occur if the camera receives a physical shock such as a hard knock, a drop on the ground or severe vibration.

@iamstrang3 (Author)

Thank you for the response. I followed the steps exactly as described, but I still don't get the correct coordinates.

@MartyG-RealSense (Collaborator) commented May 23, 2024

If the camera's projector is enabled, its IR emitter (which projects a pattern of dots onto the scene) may make fiducial markers such as AprilTags harder for the camera to read by overlaying dots on the tag image. So please try making sure that the emitter is disabled by using the Python code at #10065 (comment), in order to eliminate the dot pattern as a potential cause of inaccuracy.
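The linked comment has the exact code; as a rough sketch of the idea (the helper name here is mine, not from the SDK), disabling the emitter comes down to setting the `emitter_enabled` option on the depth sensor:

```python
def disable_emitter(depth_sensor, emitter_option):
    """Turn the IR projector off if the sensor exposes the option.

    Returns True when the option was set, False when unsupported.
    """
    if depth_sensor.supports(emitter_option):
        depth_sensor.set_option(emitter_option, 0)  # 0 = emitter off
        return True
    return False

# With a connected camera (assumes pyrealsense2 is installed):
# import pyrealsense2 as rs
# profile = rs.pipeline().start(rs.config())
# sensor = profile.get_device().first_depth_sensor()
# disable_emitter(sensor, rs.option.emitter_enabled)
```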

@MartyG-RealSense (Collaborator)

Hi @iamstrang3 Do you require further assistance with this case, please? Thanks!

@iamstrang3 (Author)

Hi, sorry for the late reply. I tried adding those lines to my code, but it did not help. I get the correct coordinates at a certain distance, and then the error keeps increasing as the distance grows.

@MartyG-RealSense (Collaborator)

At what distance does the increase in error start becoming noticeable? It is normal for error to be near zero close to the camera and to grow as the distance of an observed object / surface from the camera increases (for stereo depth, roughly with the square of the distance). This phenomenon is called RMS error and becomes noticeable on the D435i model beyond 3 meters from the camera.
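For reference, Intel's stereo tuning guide gives a rule-of-thumb estimate for this: RMS depth error ≈ distance² × subpixel disparity error / (focal length in pixels × baseline). The sketch below plugs in illustrative D435-class numbers (≈50 mm baseline, ≈0.08 subpixel error); treat them as assumptions rather than measured spec values:

```python
def depth_rms_error_mm(distance_mm, focal_px, baseline_mm, subpixel=0.08):
    """Rule-of-thumb stereo depth RMS error: grows with the square of distance."""
    return (distance_mm ** 2) * subpixel / (focal_px * baseline_mm)

# Error at 2 m is four times the error at 1 m (quadratic, not linear, growth)
e1 = depth_rms_error_mm(1000, focal_px=640, baseline_mm=50)
e2 = depth_rms_error_mm(2000, focal_px=640, baseline_mm=50)
```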

@iamstrang3 (Author) commented Jun 1, 2024

I have the D435 model, and the error appears within a meter of the camera. The depth reading itself is almost accurate; the problem lies in the conversion from pixel to 3D coordinates.

@MartyG-RealSense (Collaborator)

Is it possible to make the AprilTags larger to test whether moving the camera further away makes the tags more difficult for the camera to read?

@iamstrang3 (Author)

The camera reads the AprilTag just fine at larger distances.

@MartyG-RealSense (Collaborator)

I note that you are using 640x480 resolution. As resolution is reduced, depth accuracy also reduces. For the D435 camera model, 848x480 is the optimal resolution for depth accuracy.

Does accuracy improve at distance if you use 848x480 and 30 FPS for both depth and color streams if possible? This configuration will only be supported on a USB 3.2 connection, whilst a USB 2.1 connection is only able to use 640x480 at 30 FPS.
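The change amounts to swapping the stream-configuration lines; this is just the suggestion above written out (it needs a USB 3 connection and a connected camera to actually start):

```python
import pyrealsense2 as rs

config = rs.config()
# 848x480 is the optimal depth resolution for the D435's stereo module
config.enable_stream(rs.stream.depth, 848, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 848, 480, rs.format.bgr8, 30)
```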

@iamstrang3 (Author)

I get almost the same coordinates after changing the resolution; the difference was only 1 or 2 mm.

@MartyG-RealSense (Collaborator)

Is there any improvement when observing the tags at 1 meter range if you maximize the Sharpness of the color image? Python code for doing so can be found at #11912 (comment)
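The linked comment has the actual code; the gist (the helper name is mine) is to read the option's supported range from the color sensor and set sharpness to its maximum:

```python
def maximize_option(sensor, option):
    """Set a sensor option to the top of its supported range; return that value."""
    rng = sensor.get_option_range(option)  # exposes .min, .max, .default, .step
    sensor.set_option(option, rng.max)
    return rng.max

# With a connected camera (assumes pyrealsense2 is installed):
# import pyrealsense2 as rs
# profile = rs.pipeline().start(rs.config())
# color_sensor = profile.get_device().first_color_sensor()
# maximize_option(color_sensor, rs.option.sharpness)
```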

@MartyG-RealSense (Collaborator)

Hi @iamstrang3 Do you require further assistance with this case, please? Thanks!

@MartyG-RealSense (Collaborator)

Case closed due to no further comments received.
