hustvl / VAD

[ICCV 2023] VAD: Vectorized Scene Representation for Efficient Autonomous Driving

Home Page:https://arxiv.org/abs/2303.12077

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Difficulty in understanding the visualization script

deo-abhijit opened this issue · comments

Hello,

I am having trouble understanding the visualisation code.

Could anyone please explain what exactly is going on here in simple terms?

https://github.com/hustvl/VAD/blob/main/tools/analysis_tools/visualization.py#L800C1-L834C1

                    plan_traj = np.concatenate((
                        plan_traj[:, [0]],
                        plan_traj[:, [1]],
                        -1.0*np.ones((plan_traj.shape[0], 1)),
                        np.ones((plan_traj.shape[0], 1)),
                    ), axis=1)
                    # add the start point in lcf
                    plan_traj = np.concatenate((np.zeros((1, plan_traj.shape[1])), plan_traj), axis=0)
                    # plan_traj[0, :2] = 2*plan_traj[1, :2] - plan_traj[2, :2]
                    plan_traj[0, 0] = 0.3
                    plan_traj[0, 2] = -1.0
                    plan_traj[0, 3] = 1.0

                    l2e_r = lidar_cs_record['rotation']
                    l2e_t = lidar_cs_record['translation']
                    e2g_r = lidar_pose_record['rotation']
                    e2g_t = lidar_pose_record['translation']
                    l2e_r_mat = Quaternion(l2e_r).rotation_matrix
                    e2g_r_mat = Quaternion(e2g_r).rotation_matrix
                    s2l_r, s2l_t = obtain_sensor2top(nusc, sample_data_token, l2e_t, l2e_r_mat, e2g_t, e2g_r_mat, cam)
                    # obtain lidar to image transformation matrix
                    lidar2cam_r = np.linalg.inv(s2l_r)
                    lidar2cam_t = s2l_t @ lidar2cam_r.T
                    lidar2cam_rt = np.eye(4)
                    lidar2cam_rt[:3, :3] = lidar2cam_r.T
                    lidar2cam_rt[3, :3] = -lidar2cam_t
                    viewpad = np.eye(4)
                    viewpad[:camera_intrinsic.shape[0], :camera_intrinsic.shape[1]] = camera_intrinsic
                    lidar2img_rt = (viewpad @ lidar2cam_rt.T)
                    plan_traj = lidar2img_rt @ plan_traj.T
                    plan_traj = plan_traj[0:2, ...] / np.maximum(
                        plan_traj[2:3, ...], np.ones_like(plan_traj[2:3, ...]) * 1e-5)
                    plan_traj = plan_traj.T
                    plan_traj = np.stack((plan_traj[:-1], plan_traj[1:]), axis=1)