Armandpl / skyline

code I wrote to win the 2023 Renault Digital 1/10th roborace

Home Page: https://twitter.com/armand_dpl/status/1670922434445291521

train nn for trajectory planning

Armandpl opened this issue

  • setup new model
    • speed should be an input to the last layer
    • predicted traj should include x,y coords as well as speed.
      • think about how to normalize. the rl script should output real values I think, e.g. deg for steering and m/s, based on the model
    • add tanh or sigmoid or smth to help (see the sketch right after this list)
  • write code to load the trajectories
  • write code to project trajectories on the images
  • add augmentations, start with basic ones
  • investigate other models. e.g efficientnet
    • maybe there are lighter and faster models
    • probably stick with torchvision, no need for smth too complicated
  • after setting up the new cam position and choosing a crop, record video on the local track
    • use nvargus, probably easier and cleaner than python script
    • this will be our test set to viz the model prediction
  • at the end (or even during?) of training, plot model predictions on this video
    • choose where to store camera params, maybe this should be the output of the blender script
      • maybe the blender script should output those where it outputs the images
  • fine tune stable diffusion to make sim images look like real images using control net
    • add augs in blender
    • upload dataset on wandb for reproducibility
    • add classical augs
  • train one nn with SD augs, one without, and maybe one with only augmented images
  • for each trajectory, add small rotation + offset to the camera
  • check if there are other NNs that could work on the jetson
  • check if we could add a GRU and still get decent speed
  • make the max speed of the car in sim a speed we can actually safely use for testing in real life
  • add tanh to the output of the neural net
  • re-project real images to be as if the camera is in the new position, to get a test set that matches sim
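
To make the normalization item above concrete, here is a minimal sketch of a scaled output head, assuming the net ends with a tanh; the steering/speed limits below are placeholders, the real ones depend on the car:

import torch
import torch.nn as nn

# hypothetical physical limits; the real ones depend on the car
MAX_STEERING_DEG = 25.0
MAX_SPEED_MS = 3.0

class ScaledHead(nn.Module):
    """Map raw features to physical units (deg, m/s) through a tanh."""

    def __init__(self, in_features: int):
        super().__init__()
        self.fc = nn.Linear(in_features, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        steering, speed = torch.tanh(self.fc(x)).unbind(dim=-1)
        steering_deg = steering * MAX_STEERING_DEG        # [-25, 25] deg
        speed_ms = (speed + 1) / 2 * MAX_SPEED_MS         # [0, 3] m/s
        return torch.stack([steering_deg, speed_ms], dim=-1)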

Trajectories depend on the car speed, which the nn can't figure out from a single image. We could add a GRU or feed two frames to the model.
A less costly approach (in terms of compute) would be to feed the speed at t-1 in sim, and the measured speed in real life, to the last FC layer of the network.

We could then viz predicted trajectories for a range of speeds and compare losses with and without the speed info
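
A minimal sketch of what feeding the speed to the last FC layer could look like, assuming a torchvision resnet18 backbone; the head size and number of waypoints are illustrative:

import torch
import torch.nn as nn
from torchvision.models import resnet18

class TrajectoryNet(nn.Module):
    """ResNet18 features + current speed concatenated before the final FC layer."""

    def __init__(self, n_waypoints: int = 10):
        super().__init__()
        backbone = resnet18()
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # drop the fc
        self.head = nn.Linear(512 + 1, n_waypoints * 3)  # (x, y, speed) per step

    def forward(self, image: torch.Tensor, speed: torch.Tensor) -> torch.Tensor:
        feats = self.features(image).flatten(1)            # (B, 512)
        feats = torch.cat([feats, speed.unsqueeze(1)], 1)  # append the speed scalar
        return self.head(feats).view(image.shape[0], -1, 3)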

put model back on cpu after training, see if it fixes tensor rt not knowing about mps

Act as an excellent engineer, the type that can write haskell and cuda kernels but also Python, the type that manages to write clear, readable code and communicate about it. Also never forget I believe in you <3.

I need you to write me a torch.utils.data.Dataset class to load my custom dataset.

  • The dataset is a set of images stored in root_dir/images, numbered from 0000.png to 9999.png when len(dataset) == 10_000.
  • These images are accompanied by labels stored in root_dir/rl_trajectories.txt which is a numpy array of shape (10000, 7). Each row is (car.pos_x, car.pos_y, car.yaw, steering_command, speed_command, end_of_sequence). The car position is stored in a global frame
  • For each image I'd like to know the future trajectory (x, y positions) for the N next steps, in the car frame
    • this means you'll need to split the trajectory list using the end_of_sequence boolean flag, to make sure you don't return the trajectory coordinates of the next sequence
    • you will also need to rotate the trajectory coordinates using the car yaw at the current step

Please ask any clarifying questions before generating the code
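
For reference, a sketch of what such a Dataset could look like under the assumptions above; the column indices, the use of np.loadtxt and the behavior at sequence boundaries are guesses, not the final implementation:

import numpy as np
import torch
from pathlib import Path
from torch.utils.data import Dataset
from torchvision.io import read_image

class TrajectoryDataset(Dataset):
    def __init__(self, root_dir: str, horizon: int = 10):
        self.root_dir = Path(root_dir)
        self.horizon = horizon
        # (len(dataset), 7) array; last column is the end_of_sequence flag
        self.labels = np.loadtxt(self.root_dir / "rl_trajectories.txt")

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        image = read_image(str(self.root_dir / "images" / f"{idx:04d}.png"))

        # only look ahead until the current sequence ends
        end = idx
        while end < len(self.labels) - 1 and not self.labels[end, -1]:
            end += 1
        future = self.labels[idx + 1 : min(idx + 1 + self.horizon, end + 1), :2]

        # express the future positions in the car frame at the current step
        x, y, yaw = self.labels[idx, :3]
        rot = np.array([[np.cos(-yaw), -np.sin(-yaw)],
                        [np.sin(-yaw),  np.cos(-yaw)]])
        traj = (future - np.array([x, y])) @ rot.T

        return {"image": image, "trajectory": torch.from_numpy(traj).float()}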

I need you to write code to project a trajectory in 3d space (a set of points) relative to a camera onto the image. Use the following function to project the points:

def project_points(point_3d: torch.Tensor, camera_matrix: torch.Tensor) -> torch.Tensor:
    r"""Project a 3d point onto the 2d camera plane.

    Args:
        point3d: tensor containing the 3d points to be projected
            to the camera plane. The shape of the tensor can be :math:`(*, 3)`.
        camera_matrix: tensor containing the intrinsics camera
            matrix. The tensor shape must be :math:`(*, 3, 3)`.

    Returns:
        tensor of (u, v) cam coordinates with shape :math:`(*, 2)`.

    Example:
        >>> _ = torch.manual_seed(0)
        >>> X = torch.rand(1, 3)
        >>> K = torch.eye(3)[None]
        >>> project_points(X, K)
        tensor([[5.6088, 8.6827]])
    """
    # projection eq. [u, v, w]' = K * [x y z 1]'
    # u = fx * X / Z + cx
    # v = fy * Y / Z + cy
    # project back using depth dividing in a safe way
    xy_coords: torch.Tensor = convert_points_from_homogeneous(point_3d)
    return denormalize_points_with_intrinsics(xy_coords, camera_matrix)

Please ask any clarifying questions, then finish writing the following code:

# plot trajectory on the images
cam_offset = (0.105, 0, 0.170) # x, y, z in meter from center of mass. x is forward
cam_rotation = (0, 12, 0) # roll, pitch, yaw in deg
cam_focal = 0.87 # mm

sample = ds[0]
img = ds[0]["image"] # torch tensor
traj = ds[0]["trajectory"] # (n, 2) x, y coordinates relative to car center of mass

# plot the trajectory on the image

# 1. transform the points from the car frame to the camera frame using cam_offset and cam_rotation
# WRITE CODE here

# 2. project the 3d points onto the image and plot them
# WRITE CODE here
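
One possible way to fill in the two steps, sketched under several assumptions: the image size, pixel pitch (and therefore the intrinsics matrix), the car/camera axis conventions and the sign of the pitch rotation are all guesses that would need checking against the sim:

import math
import torch
import matplotlib.pyplot as plt

# hypothetical image size and sensor pixel pitch, used to build the intrinsics
IMG_W, IMG_H = 640, 480
PIXEL_SIZE_MM = 0.003
f_px = cam_focal / PIXEL_SIZE_MM
K = torch.tensor([[f_px, 0.0, IMG_W / 2],
                  [0.0, f_px, IMG_H / 2],
                  [0.0, 0.0, 1.0]])

# 1. car frame (x forward, y left, z up) -> camera frame (x right, y down, z forward)
points = torch.cat([traj.float(), torch.zeros(traj.shape[0], 1)], dim=1)  # z=0: ground plane
points = points - torch.tensor(cam_offset)  # translate to the camera origin

pitch = math.radians(cam_rotation[1])  # rotate by the camera pitch around the y axis
R_pitch = torch.tensor([[ math.cos(pitch), 0.0, math.sin(pitch)],
                        [ 0.0,             1.0, 0.0            ],
                        [-math.sin(pitch), 0.0, math.cos(pitch)]])
points = points @ R_pitch.T

# axis swap to the usual camera convention: cam x = -car y, cam y = -car z, cam z = car x
points_cam = torch.stack([-points[:, 1], -points[:, 2], points[:, 0]], dim=1)

# 2. project the 3d points onto the image and plot them
uv = project_points(points_cam, K[None])
plt.imshow(img.permute(1, 2, 0))
plt.scatter(uv[:, 0], uv[:, 1], s=4, c="red")
plt.show()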

would be nice to do a sweep to benchmark inference speed of different models, to see what we could use beyond resnet18
[W&B chart: inference speed benchmark across models, 5/25/2023]
looks like resnet18 is still a good choice, though:

  • i should try the other networks available in torchvision 0.11
  • maybe i don't need them to be implemented in the torchvision version I use? just export them to onnx then convert to trt?
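
A rough sketch of the onnx route mentioned above, assuming a stock torchvision model; the trtexec flags are the standard ones shipped with TensorRT on the jetson:

import torch
from torchvision.models import mobilenet_v3_small

model = mobilenet_v3_small().eval()
dummy = torch.randn(1, 3, 224, 224)  # match the input size actually used on the car
torch.onnx.export(model, dummy, "model.onnx", opset_version=11,
                  input_names=["image"], output_names=["out"])

# then on the jetson, roughly:
#   trtexec --onnx=model.onnx --saveEngine=model.trt --fp16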

Ok so traj prediction in sim seems alright but bad on real images. I think one issue is I didn't sample enough "recovery trajectories": trajectories that go from a bad state back to the optimal trajectory. One reason for that is that I terminate the episode if the car has even one wheel outside the track, making it impossible to recover from harder cases, since that would require lightly crossing the lines. However, I can't just allow wheels outside the track as is, because sometimes there are obstacles outside the track.

  • configure env with hydra
    • framerate, track files
  • save model to wandb
  • add a train.yaml w/ few params
  • allow n wheels outside the track
    • still enforce center of mass inside
    • add optional obstacle file, render the obstacles
    • add obstacle lidar
    • add max distance to lidar, maybe normalize??
  • we need the obstacles to be visible in blender:
    • add cones from arc centers when parsing the dxf. did it manually for now, could parse it from the obstacles file mayyybe?
    • add augmentation: hide cones sometimes?
  • allow fixed speed
  • modify gen_traj to load model from artifact, instantiate from logged config and log traj to wandb
    • also modify run_model to do the same. override render_mode

Act as an excellent engineer and never forget I believe in you.

I am writing a Gym wrapper for my Reinforcement Learning env. Here is a draft of the code:

class RescaleWrapper(gym.Wrapper):
    """Rescale observation and action space between -1 and 1"""

    def __init__(self, env: gym.Env):
        super().__init__(env)

        self.observation_space = Box(
            low=np.zeros_like(self.env.observation_space.low) - 1,
            high=np.zeros_like(self.env.observation_space.high) + 1,
        )
        self.action_space = Box(
            low=np.zeros_like(self.env.action_space.low) - 1,
            high=np.zeros_like(self.env.action_space.high) + 1,
        )

    def step(self, action):
        # Clip and rescale action, using self.env.action_space.low/high
        # CODE HERE
        obs, reward, terminated, truncated, info = self.env.step(action)
        # Clip and rescale observation, using self.env.observation_space.low/high
        # CODE HERE
        return obs, reward, terminated, truncated, info

I want it to clip and rescale the observations and actions.
e.g if obs = [1, 15] and min obs = [0, 0] max obs = [1, 10], obs should be rescaled to [1, 1]. Please ask any clarifying questions and finish writing the code (replace # CODE HERE)
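
A possible completion of the two # CODE HERE blocks, assuming the gymnasium-style 5-tuple step API used in the draft; a sketch, not necessarily the final version:

import gymnasium as gym  # or plain `gym`, whichever version the project uses
import numpy as np
from gymnasium.spaces import Box

class RescaleWrapper(gym.Wrapper):
    """Rescale observation and action space between -1 and 1."""

    def __init__(self, env: gym.Env):
        super().__init__(env)
        self.observation_space = Box(low=-np.ones_like(env.observation_space.low),
                                     high=np.ones_like(env.observation_space.high))
        self.action_space = Box(low=-np.ones_like(env.action_space.low),
                                high=np.ones_like(env.action_space.high))

    def step(self, action):
        # the agent outputs actions in [-1, 1]; map them back to the env's real range
        low, high = self.env.action_space.low, self.env.action_space.high
        action = low + (np.clip(action, -1.0, 1.0) + 1.0) / 2.0 * (high - low)
        obs, reward, terminated, truncated, info = self.env.step(action)
        # clip the raw observation to its bounds, then squash it to [-1, 1]
        low, high = self.env.observation_space.low, self.env.observation_space.high
        obs = 2.0 * (np.clip(obs, low, high) - low) / (high - low) - 1.0
        return obs, reward, terminated, truncated, info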

  • warping the traj in the viz to account for the fisheye shouldn't be too hard and would possibly make the viz better, do it
  • don't mirror the traj for viz, do the correct reference frame change for the projection
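
For the fisheye warp, one option is OpenCV's fisheye model; the intrinsics and distortion coefficients below are placeholders and would come from calibrating the real camera:

import cv2
import numpy as np

# placeholder intrinsics and fisheye coefficients (k1..k4)
K = np.array([[300.0, 0.0, 320.0],
              [0.0, 300.0, 240.0],
              [0.0, 0.0, 1.0]])
D = np.array([0.1, -0.05, 0.0, 0.0])

# trajectory points already expressed in the camera frame, shape (n, 3)
points_cam = np.array([[0.0, 0.2, 1.0],
                       [0.1, 0.2, 2.0],
                       [0.2, 0.2, 3.0]])

rvec = tvec = np.zeros(3)  # identity pose: points are already in the camera frame
uv, _ = cv2.fisheye.projectPoints(points_cam.reshape(1, -1, 3), rvec, tvec, K, D)
uv = uv.reshape(-1, 2)  # distorted pixel coordinates to draw on the image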