thomasfermi / Algorithms-for-Automated-Driving

Each chapter of this (mini-)book guides you in programming one important software component for automated driving.

Home Page: https://thomasfermi.github.io/Algorithms-for-Automated-Driving/Introduction/intro.html

Do we have a trafo_world_to_cam for each image, or does it remain fixed?

Rajat-Mehta opened this issue · comments

I see in your Jupyter notebook that you have a trafo_world_to_cam file with the same name as the input image. Does that mean we have a different transformation file for each image, or can we reuse the same file (obviously taken from the same camera under the same settings)?

Thanks in advance.

Hi @Rajat-Mehta ! I do not know which Jupyter Notebook you mean, but I guess I can answer your question anyway.

In general, different images are captured from different positions (and orientations). This is because the camera is attached to a vehicle that is moving. Hence, there is a different transformation "world_to_cam" for each image. The camera is moving with respect to the stationary world coordinate system.

TLDR: You cannot reuse the transformation files. Each image needs its own transformation matrix.
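To make this concrete, here is a small sketch (not code from the book; the matrices and point are made-up placeholders) showing that the same world point lands at different camera-frame coordinates once the vehicle has moved, which is why each image needs its own 4x4 homogeneous trafo_world_to_cam:

```python
import numpy as np

def world_to_cam(point_world, trafo_world_to_cam):
    """Map a 3D point from the world frame into the camera frame."""
    p_hom = np.append(point_world, 1.0)        # homogeneous coordinates
    return (trafo_world_to_cam @ p_hom)[:3]

# Hypothetical poses for two consecutive images. Image 0: camera at the
# world origin. Image 1: the vehicle has driven 2 m along the world x-axis,
# so world coordinates shift by -2 m when expressed in the camera frame.
trafo_frame_0 = np.eye(4)
trafo_frame_1 = np.eye(4)
trafo_frame_1[:3, 3] = [-2.0, 0.0, 0.0]

point_world = np.array([10.0, 0.0, 0.0])
p_cam_0 = world_to_cam(point_world, trafo_frame_0)  # -> [10, 0, 0]
p_cam_1 = world_to_cam(point_world, trafo_frame_1)  # -> [ 8, 0, 0]
```

The same stationary point appears 2 m closer in the second image, so reusing the first image's transformation for the second would place every label in the wrong spot.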

@thomasfermi Thanks for your quick response. Does that mean we will need a new "trafo_world_to_cam" file corresponding to each image frame in order to convert the lane detection results from the pixel to the world (or road) coordinate system?

If yes, how is that going to work when we deploy this model in a real car?

Hi @Rajat-Mehta! I think you mixed up some parts of the book.

The trafo_world_to_cam matrices are only needed in the part of the book where we create training data for the neural network. We get the lane boundaries in the world frame from Carla's API. We also get the trafo_world_to_cam from Carla. We combine this to get the lane boundaries in the camera reference frame. Then we project those lane boundaries into a "label image". This is used to train the neural net. In the real world, you could get these label images by manually drawing lane lines into your image.
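The label-creation pipeline described above can be sketched roughly as follows. This is a simplified illustration, not the book's actual code: the intrinsic matrix `K`, the image size, and the nearest-pixel rasterization are all assumptions, and the boundary points are placeholders already expressed in the camera frame.

```python
import numpy as np

# Assumed pinhole intrinsics (focal length 500 px, principal point 512/256).
K = np.array([[500.0,   0.0, 512.0],
              [  0.0, 500.0, 256.0],
              [  0.0,   0.0,   1.0]])

def project_to_pixels(points_cam):
    """Project 3D points (camera frame, z pointing forward) to pixel coords."""
    uvw = (K @ points_cam.T).T
    return uvw[:, :2] / uvw[:, 2:3]    # divide by depth

# A lane boundary that was already transformed world -> camera frame
# using the per-image trafo_world_to_cam:
boundary_cam = np.array([[0.0, 1.5,  5.0],
                         [0.0, 1.5, 10.0]])
pixels = project_to_pixels(boundary_cam)

# Rasterize the projected points into a binary "label image":
label = np.zeros((512, 1024), dtype=np.uint8)
for u, v in pixels:
    label[int(round(v)), int(round(u))] = 1
```

A real pipeline would also clip points behind the camera or outside the image and draw connected polylines instead of isolated pixels, but the chain (world frame, then camera frame via trafo_world_to_cam, then pixels via the intrinsics) is the point here.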

Now to this point:

... convert the lane detection results from pixel to world (or road) coordinate system.

There is a big difference between what I called the "road frame" and what I called the "world frame" in the book. Please have a look at this image from the book again. The world frame is stationary (or fixed to planet earth, if you will). The road frame, however, is fixed to the car: it moves with the car. The camera frame is also fixed to the car. Hence, the trafo_road_to_cam (and its inverse trafo_cam_to_road) is constant. You will see this if you do the exercises on lane detection, where you implement all the #TODOs in camera_geometry.py. One #TODO asks you to set the proper value for self.trafo_cam_to_road. This matrix is constant.
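As a hedged sketch of why this matrix is constant: it depends only on how the camera is mounted on the car (its height above the road and its pitch), and the mounting does not change while driving. The axis conventions, default values, and exact matrix below are assumptions for illustration and may differ from the book's camera_geometry.py:

```python
import numpy as np

def trafo_cam_to_road(height=1.3, pitch_deg=-5.0):
    """Constant 4x4 transform from the camera frame to the road frame.

    Assumptions (not necessarily the book's conventions): the camera
    y-axis points down, the road origin lies on the ground directly
    below the camera, and the camera is pitched about its x-axis.
    """
    pitch = np.deg2rad(pitch_deg)
    c, s = np.cos(pitch), np.sin(pitch)
    rot = np.array([[1.0, 0.0, 0.0],
                    [0.0,   c,  -s],
                    [0.0,   s,   c]])
    trafo = np.eye(4)
    trafo[:3, :3] = rot
    trafo[:3, 3] = [0.0, -height, 0.0]   # road origin sits `height` below the camera
    return trafo
```

Since height and pitch are fixed mounting parameters, calling this function always yields the same matrix; it can be computed once (as in the `self.trafo_cam_to_road` attribute mentioned above) rather than per frame.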