MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation
Chongjian GE*,
Junsong Chen*,
Enze Xie+,
Zhongdao Wang,
Lanqing Hong,
Huchuan Lu,
Zhenguo Li,
Ping Luo+
(* denotes equal contribution, + denotes corresponding authors)
Project Page | arXiv | YouTube demo
Updates
- (20/04/2023) MetaBEV is released on arXiv.
Abstract
Perception systems in modern autonomous vehicles typically take inputs from complementary multi-modal sensors, e.g., LiDAR and cameras. In real-world applications, however, sensor corruptions and failures degrade performance and thus compromise driving safety.
In this paper, we propose a robust framework, called MetaBEV, to address extreme real-world environments, covering six sensor corruptions and two extreme sensor-missing situations.
Experiments show that MetaBEV outperforms prior art by a large margin on both full and corrupted modalities. For instance, when the LiDAR signal is missing, MetaBEV improves detection NDS by 35.5% and segmentation mIoU by 17.7% over the vanilla BEVFusion model; when the camera signal is absent, MetaBEV still achieves 69.2% NDS and 53.7% mIoU, which is even higher than previous works that operate on full modalities. Moreover, MetaBEV performs competitively against previous methods in both canonical perception and multi-task learning settings, setting a new state of the art for nuScenes BEV map segmentation with 70.4% mIoU.
Results
Our model achieves the following performance on nuScenes:
1-Single Complementary Modalities.
- Detection on nuScenes val set with LiDAR and Camera.
Methods | Modality | Multi-Task | mAP(val) | NDS(val) |
---|---|---|---|---|
MetaBEV-Transfusion | Camera | x | 49.4 | 49.7 |
MetaBEV-Centerhead | Camera | x | 55.5 | 60.4 |
MetaBEV-Transfusion | LiDAR | x | 62.5 | 68.6 |
MetaBEV-Centerhead | LiDAR | x | 64.2 | 69.3 |
MetaBEV-Transfusion | Camera+LiDAR | x | 68.0 | 71.5 |
MetaBEV-Transfusion | Camera+LiDAR | √ | 65.4 | 69.8 |
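For readers unfamiliar with the NDS column, the following is a minimal sketch of the nuScenes detection score as defined by the nuScenes benchmark: a weighted combination of mAP and five mean true-positive error metrics (translation, scale, orientation, velocity, attribute). The function name `nds` and the error values in the usage lines are hypothetical and chosen only for illustration; they are not MetaBEV results. Inputs are fractions in [0, 1], whereas the table reports percentages.

```python
def nds(map_score, tp_errors):
    """nuScenes detection score from mAP and the five mean TP errors.

    map_score: mean average precision as a fraction in [0, 1].
    tp_errors: five mean true-positive errors (mATE, mASE, mAOE, mAVE, mAAE).
    """
    if len(tp_errors) != 5:
        raise ValueError("expected exactly five true-positive error metrics")
    # Each error is clipped at 1 and converted into a score in [0, 1].
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    # mAP carries weight 5; each of the five TP scores carries weight 1.
    return (5.0 * map_score + sum(tp_scores)) / 10.0


# Hypothetical inputs for illustration only (not taken from the paper).
example_errors = [0.30, 0.25, 0.30, 0.25, 0.20]
print(f"NDS = {100 * nds(0.68, example_errors):.1f}")
```

Because mAP accounts for half of the score, a model can have a higher NDS than mAP when its true-positive errors are small, which is the pattern seen in every row of the table.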
- Segmentation on nuScenes val set with LiDAR and Camera.
Methods | Modality | Drivable | Ped.Cross | Walkway | Stop Line | Carpark | Divider | Mean |
---|---|---|---|---|---|---|---|---|
MetaBEV | Camera | 83.3 | 56.7 | 61.4 | 50.8 | 55.5 | 48.0 | 59.3 |
MetaBEV | LiDAR | 87.9 | 63.4 | 71.6 | 55.0 | 55.1 | 55.7 | 64.8 |
MetaBEV | Camera+LiDAR | 89.6 | 68.4 | 74.8 | 63.3 | 64.4 | 61.8 | 70.4 |
MetaBEV | Camera+LiDAR | 88.5 | 64.9 | 71.8 | 56.7 | 61.1 | 58.2 | 66.9 |
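The Mean column appears to be the unweighted average of the six per-class IoU values. The short sketch below recomputes it from the numbers in the table; the names `rows` and `mean_iou` are illustrative, and the label for the second Camera+LiDAR row is only a placeholder because the table does not say how that row differs from the first.

```python
# Per-class IoU values copied from the segmentation table above, in the order
# Drivable, Ped.Cross, Walkway, Stop Line, Carpark, Divider.
rows = {
    "Camera": [83.3, 56.7, 61.4, 50.8, 55.5, 48.0],
    "LiDAR": [87.9, 63.4, 71.6, 55.0, 55.1, 55.7],
    "Camera+LiDAR (first row)": [89.6, 68.4, 74.8, 63.3, 64.4, 61.8],
    "Camera+LiDAR (second row)": [88.5, 64.9, 71.8, 56.7, 61.1, 58.2],
}


def mean_iou(class_ious):
    """Unweighted mean over classes, rounded to one decimal like the table."""
    return round(sum(class_ious) / len(class_ious), 1)


for name, ious in rows.items():
    print(f"{name}: {mean_iou(ious)}")
```

The recomputed values match the reported Mean column (59.3, 64.8, 70.4, 66.9).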
2-Missing Modalities.
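Methods | Camera+LiDAR mAP | Camera+LiDAR NDS | Camera+LiDAR mIoU | Missing Camera mAP | Missing Camera NDS | Missing Camera mIoU | Missing LiDAR mAP | Missing LiDAR NDS | Missing LiDAR mIoU |
---|---|---|---|---|---|---|---|---|---|
MetaBEV | 68.0 | 71.5 | 70.4 | 63.6 | 69.2 | 53.7 | 39.0 | 42.6 | 54.4 |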
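To make the robustness numbers easier to read, the sketch below computes the absolute drop of each metric relative to the full Camera+LiDAR setting, using only the values reported in the table above. The names `full`, `missing`, and `metric_drop` are illustrative.

```python
# Values copied from the missing-modality table above.
full = {"mAP": 68.0, "NDS": 71.5, "mIoU": 70.4}
missing = {
    "Missing Camera": {"mAP": 63.6, "NDS": 69.2, "mIoU": 53.7},
    "Missing LiDAR": {"mAP": 39.0, "NDS": 42.6, "mIoU": 54.4},
}


def metric_drop(reference, degraded):
    """Absolute drop (in points) of each metric relative to the reference."""
    return {k: round(reference[k] - degraded[k], 1) for k in reference}


for setting, scores in missing.items():
    print(setting, metric_drop(full, scores))
```

Under these numbers, losing the camera costs 2.3 NDS points but 16.7 mIoU points, while losing LiDAR costs 28.9 NDS points and 16.0 mIoU points, so detection depends much more on LiDAR than segmentation does.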
3-Corrupted Modalities.
Acknowledgements
This project is based on mmdetection3d, BEVFusion, and robust benchmark. We thank the authors for their excellent work.
License
This project is under the MIT license. See LICENSE for details.
Citation
If you find MetaBEV useful or relevant to your research, please consider citing our paper:
@article{ge2023metabev,
title={MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation},
author={Ge, Chongjian and Chen, Junsong and Xie, Enze and Wang, Zhongdao and Hong, Lanqing and Lu, Huchuan and Li, Zhenguo and Luo, Ping},
journal={arXiv preprint arXiv:2304.09801},
year={2023}
}