Multi modal fusion

Byte247 opened this issue 7 months ago · comments

Is there an example of how to implement a multi-modal approach in Det3D, such as how to adjust the data loading pipeline? I am mainly using the nuScenes dataset for now.

Tom Sanitz · Answer 1 · Fri Jan 12 2024 20:28:33 GMT+0800 (China Standard Time)

Never mind, I found a way to access the camera images by extending nuscenes.py.

Tom Sanitz · Answer 2 · Wed Jan 24 2024 00:25:00 GMT+0800 (China Standard Time)

So that other people struggle less in the future: you need to add your custom entry, e.g. "camera_images", to the key list in det3d/torchie/parallel/collate.py.
e.g.:
if key in ["voxels", "num_points", "num_gt", "voxel_labels", "num_voxels",
           "cyv_voxels", "cyv_num_points", "cyv_num_voxels", "camera_images"]:

Otherwise the values under that key will be converted back to a NumPy array, even if they were torch.Tensors before.
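
For anyone who wants to see the idea in isolation, here is a rough sketch of a collate function that dispatches on the key name. This is not the actual Det3D code: collate_sketch, TENSOR_KEYS, and the to_tensor parameter are names made up for illustration, and the concatenate/stack behaviour is an assumption about how such a function typically works. The point is only that keys on the list go through the tensor conversion, while everything else falls through to a NumPy array.

```python
import numpy as np

# Hypothetical key list mirroring the one quoted above;
# "camera_images" is the custom entry added for the camera data.
TENSOR_KEYS = ["voxels", "num_points", "num_gt", "voxel_labels", "num_voxels",
               "cyv_voxels", "cyv_num_points", "cyv_num_voxels", "camera_images"]


def collate_sketch(batch, to_tensor, tensor_keys=TENSOR_KEYS):
    """Merge a list of per-sample dicts into one batch dict.

    Keys listed in tensor_keys are concatenated along axis 0 and passed
    through to_tensor (for example torch.as_tensor). Every other key falls
    back to a stacked NumPy array, whatever type it had before.
    """
    merged = {}
    for key in batch[0]:
        elems = [np.asarray(sample[key]) for sample in batch]
        if key in tensor_keys:
            merged[key] = to_tensor(np.concatenate(elems, axis=0))
        else:
            merged[key] = np.stack(elems, axis=0)
    return merged
```

With PyTorch installed you would call it as collate_sketch(batch, to_tensor=torch.as_tensor). If "camera_images" were missing from the key list, it would take the else branch and come out as a plain NumPy array.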