thomasfermi / Algorithms-for-Automated-Driving

Each chapter of this (mini-)book guides you in programming one important software component for automated driving.

Home Page: https://thomasfermi.github.io/Algorithms-for-Automated-Driving/Introduction/intro.html


Question about the XYZ road coordinates and projecting the BEV image.

tldrafael opened this issue

First off, thank you for this incredible work! It's been very insightful for my studies!

However, I'm facing a problem when I try to extract the XYZ road coordinates from the uv-grid. I'm currently trying to create a Bird's Eye View (BEV) projection from a CARLA frame using the camera's intrinsic and extrinsic parameters. I'm computing the XYZ world/road coordinates based on your camera_calibration.py and the lidar_to_camera.py tutorial, so that I can generate the BEV image later.

However, after I compute the XYZ road coordinates from (u, v), I get a lot of negative values on the x-axis, which I assume does not make sense: the car should be the depth reference at 0, so any (u, v) in the image should map to a depth greater than zero. I also noticed that the x-axis results are symmetrical around zero and around the v-axis of the image (see the image at the end of the issue).

I have been struggling with this problem for a few days. I would greatly appreciate any help or hints. Thank you very much.

Reproducible code for what I am seeing:

import sys
sys.path.append('Algorithms-for-Automated-Driving/code/solutions/lane_detection/')
from camera_geometry import CameraGeometry

import carla
import random
import queue
import matplotlib.pyplot as plt
from skimage import io
import numpy as np


def to_bgra_array(image):
    """Convert a CARLA raw image to a BGRA numpy array."""
    array = np.frombuffer(image.raw_data, dtype=np.dtype("uint8"))
    array = np.reshape(array, (image.height, image.width, 4))
    return array


def to_rgb_array(image):
    """Convert a CARLA raw image to a RGB numpy array."""
    array = to_bgra_array(image)
    # Convert BGRA to RGB.
    array = array[:, :, :3]
    array = array[:, :, ::-1]
    return array

client = carla.Client('localhost', 2000)
client.set_timeout(110.0)
world = client.get_world()
world_map = world.get_map()

blueprint_library = world.get_blueprint_library()
# Choose a random vehicle and spawn point
bp = random.choice(blueprint_library.filter('vehicle'))
transform = random.choice(world_map.get_spawn_points())
waypoints = world_map.get_waypoint(transform.location)
ego_vehicle = world.spawn_actor(bp, transform)
ego_vehicle.set_autopilot(True)

camera_bp = blueprint_library.find('sensor.camera.rgb')
camera_transform = carla.Transform(carla.Location(x=1.5, z=2.4))
camera = world.spawn_actor(camera_bp, camera_transform, attach_to=ego_vehicle)
image_queue = queue.Queue()
camera.listen(image_queue.put)

for _ in range(1):
    waypoint = random.choice(waypoints.next(1.5))
    ego_vehicle.set_transform(waypoint.transform)
    world.tick(1)
    image = image_queue.get()

# Keep the RGB frame as float in [0, 1] for the overlays further below
im_cur = to_rgb_array(image) / 255.

Get the camera parameters and generate the XYZ coordinates from the uv-grid:

cam_T = camera.get_transform()
cam_height = cam_T.location.z

image_w = camera_bp.get_attribute("image_size_x").as_int()
image_h = camera_bp.get_attribute("image_size_y").as_int()
fov = camera_bp.get_attribute("fov").as_float()
# Pinhole focal length in pixels: f = w / (2 * tan(fov / 2)); not used further below
focal = image_w / (2 * np.tan(fov * np.pi / 360))

cg = CameraGeometry(image_width=image_w, image_height=image_h, field_of_view_deg=fov,
                    pitch_deg=cam_T.rotation.pitch, yaw_deg=cam_T.rotation.yaw,
                    roll_deg=cam_T.rotation.roll, height=cam_T.location.z)

uv_grid = np.meshgrid(np.arange(image_w), np.arange(image_h))
uv_grid = np.dstack(uv_grid).reshape(-1, 2)

XYZ_coords = []
for u, v in uv_grid:
    XYZ_res = cg.uv_to_roadXYZ_roadframe(u, v)
    XYZ_coords.append(XYZ_res)

XYZ_coords = np.stack(XYZ_coords)

Investigating the "deciles" of each coordinate:

for ix in range(XYZ_coords.shape[-1]):
    print(np.quantile(XYZ_coords[:, ix], np.linspace(.01, .99, 11)).round(1))

# [-160.3  -14.8   -7.8   -5.    -2.5    0.     2.5    5.     7.8   14.8   160.3]
# [-0. -0. -0.  0.  0.  0.  0.  0.  0.  0.  0.]
# [-81.1  -7.4  -3.9  -2.5  -1.3  -0.    1.2   2.5   3.9   7.4  79.7]

I plot the depth information below. Notice the symmetry along the x-axis of the plot, and that the maximum values are concentrated at the middle of the image. Is this expected behavior?

fig, axs = plt.subplots(2, 3, figsize=(16, 10), squeeze=False)
[a.set_axis_off() for a in axs.ravel()]

# Clip the coordinate values for better visualization
q_margin = 0.05
XYZ_coords_tidy = XYZ_coords.copy()
XYZ_coords_tidy[..., 0] = np.clip(XYZ_coords[..., 0], *np.quantile(XYZ_coords[..., 0], (q_margin, 1 - q_margin)))
XYZ_coords_tidy[..., 2] = np.clip(XYZ_coords[..., 2], *np.quantile(XYZ_coords[..., 2], (q_margin, 1 - q_margin)))

im_cur_depth = XYZ_coords_tidy[..., 0].reshape((image_h, image_w)).copy()
im_cur_depth = (im_cur_depth - im_cur_depth.min()) / (im_cur_depth.max() - im_cur_depth.min())
im_cur_overlap = im_cur * .5 + im_cur_depth[..., None] * .5

axs[0, 0].imshow(im_cur)
axs[0, 1].imshow(im_cur_depth, cmap='gray')
axs[0, 2].imshow(im_cur_overlap)

im_cur_yaxis = XYZ_coords_tidy[..., 2].reshape((image_h, image_w)).copy()
im_cur_yaxis = (im_cur_yaxis - im_cur_yaxis.min()) / (im_cur_yaxis.max() - im_cur_yaxis.min())
im_cur_overlap = im_cur * .5 + im_cur_yaxis[..., None] * .5

axs[1, 1].imshow(im_cur_yaxis, cmap='gray')
axs[1, 2].imshow(im_cur_overlap)

[image: camera frame, depth map, and lateral-coordinate map overlays showing the symmetry]

System information:

CARLA - 0.9.13
Ubuntu - 20.04

Hi @tldrafael, from your code it seems like you converted all (u, v) coordinates to XYZ coordinates with the CameraGeometry object. However, this is not the intended use of the CameraGeometry object. The conversion to XYZ only works for (u, v) coordinates that correspond to the road. Maybe read the derivation in the book again to see why.
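Maybe a minimal sketch helps to see it (just an illustration with made-up numbers and zero pitch/roll/yaw, not the book's code). The inverse perspective mapping intersects the viewing ray of each pixel with the ground plane. For pixels above the horizon, that intersection lies behind the camera, mirrored through the origin, which is exactly the symmetry you observed:

import numpy as np

# Hypothetical camera parameters, roughly matching your setup
h = 2.4                                   # camera height above the road [m]
image_w, image_h = 1024, 512
fov_deg = 90.0
focal = image_w / (2 * np.tan(np.deg2rad(fov_deg) / 2))
cu, cv = image_w / 2, image_h / 2         # principal point

def uv_to_ground(u, v):
    # Viewing ray in camera coordinates (x right, y down, z forward)
    ray = np.array([(u - cu) / focal, (v - cv) / focal, 1.0])
    # The road plane is y = h in this frame; solve lam * ray_y = h
    lam = h / ray[1]
    # lam > 0 only for pixels below the horizon (v > cv here). Above the
    # horizon lam < 0, i.e. the "intersection" is behind the camera,
    # mirrored through the origin -- the symmetry you see in your plots.
    return lam * ray

print(uv_to_ground(512, 400))  # below the horizon -> valid point in front
print(uv_to_ground(512, 100))  # above the horizon -> negative (mirrored) depth

So before calling uv_to_roadXYZ_roadframe you should restrict yourself to pixels that actually show road; in the book, the lane detection provides exactly those pixels.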

Thank you for your reply @thomasfermi. Yes, I converted all (u, v) coords to XYZ. I see that the only correct XYZ coords are the ones where (u, v) belongs to the road, because of the height=0 and planar-road assumptions. However, the XYZ values still seem awkward, and it happens with different camera settings. For example, here is another image with the depth information:

[image: second example frame showing the same symmetric depth pattern]

I also checked your notebook InversePerspectiveMapping.ipynb. If you look at the distribution of y_arr (the depth in the ISO 8855 frame), this "symmetry problem" also happens there.
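Following your explanation, I guess the fix on my side is to mask out the points behind the camera (and ideally keep only actual road pixels) before rasterizing the BEV. A rough sketch of what I mean, continuing from the variables in my code above, with placeholder grid resolution/extents and my axis convention (index 0 forward, index 2 lateral):

# Keep only points in front of the car; the points behind the camera
# come from pixels above the horizon and must be discarded
mask = XYZ_coords[:, 0] > 0
road_pts = XYZ_coords[mask]
road_rgb = im_cur.reshape(-1, 3)[mask]  # uv_grid and im_cur share row-major order

# Rasterize into a BEV grid: 0.1 m/px, 40 m ahead, +/- 10 m laterally
res, x_max, y_max = 0.1, 40.0, 10.0
bev = np.zeros((int(x_max / res), int(2 * y_max / res), 3))
ix = (road_pts[:, 0] / res).astype(int)             # forward -> rows
iy = ((road_pts[:, 2] + y_max) / res).astype(int)   # lateral -> columns
ok = (ix >= 0) & (ix < bev.shape[0]) & (iy >= 0) & (iy < bev.shape[1])
bev[ix[ok], iy[ok]] = road_rgb[ok]

plt.imshow(bev[::-1])  # flip rows so that "ahead" points up
plt.show()

This still relies on the flat-road assumption for everything below the horizon, so obstacles would be smeared across the BEV, but at least the mirrored negative depths are gone.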