franter666 / BEVPerception-Survey-Recipe

Awesome BEV perception papers and cookbook for achieving SOTA results


Overview of BEV Perception

The general picture of BEV perception at a glance, consisting of three sub-parts based on the input modality. BEV perception is a general task built on top of a series of fundamental tasks. To give a more complete picture of perception algorithms in autonomous driving, we list other topics as well.

Datasets of BEV Perception

Academic Summary of BEV Perception

Important BEV perception methods in recent years, covering different modalities and tasks.

Performance of important BEV perception methods in recent years, across different settings and leaderboards.

BEV Camera

A general pipeline in BEV Camera

Related literature:

  • Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D / paper / project / ECCV 2020 / LSS
  • BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View / paper / project / arXiv / BEVDet
  • BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection / paper / project / arXiv / BEVDet4D
  • BEVDepth: Acquisition of Reliable Depth for Multi-view 3D Object Detection / paper / project / arXiv / BEVDepth
  • DSGN: Deep Stereo Geometry Network for 3D Object Detection / paper / supplemental / project / CVPR 2020
  • LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-Based 3D Detector / paper / supplemental / project / ICCV 2021
  • Is Pseudo-Lidar Needed for Monocular 3D Object Detection? / paper / supplemental / project / ICCV 2021
  • Inverse perspective mapping simplifies optical flow computation and obstacle detection / paper / ? / IPM
  • Deep Learning based Vehicle Position and Orientation Estimation via Inverse Perspective Mapping Image / paper / IV 2019
  • Learning to Map Vehicles into Bird’s Eye View / ICIAP 2017
  • Monocular 3D Vehicle Detection Using Uncalibrated Traffic Cameras through Homography / paper / IROS 2021
  • Driving Among Flatmobiles: Bird-Eye-View Occupancy Grids From a Monocular Camera for Holistic Trajectory Planning / paper / WACV 2021
  • Understanding Bird’s-Eye View of Road Semantics Using an Onboard Camera / paper / project / IEEE Robotics and Automation Letters 2022
  • Automatic dense visual semantic mapping from street-level imagery / paper / IROS 2012
  • Stacked Homography Transformations for Multi-View Pedestrian Detection / paper / ICCV 2021
  • Cross-View Semantic Segmentation for Sensing Surroundings / paper / project / IEEE Robotics and Automation Letters 2020
  • FISHING Net: Future Inference of Semantic Heatmaps In Grids / paper / arXiv
  • NEAT: Neural Attention Fields for End-to-End Autonomous Driving / paper / project / ICCV 2021
  • Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-View Transformation / paper / project / CVPR 2021
  • Bird’s-Eye-View Panoptic Segmentation Using Monocular Frontal View Images / paper / project / IEEE Robotics and Automation Letters 2022
  • BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers / paper / project / ECCV 2022
  • PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark / paper / project / ECCV 2022
  • PETR: Position Embedding Transformation for Multi-View 3D Object Detection / paper / project / ECCV 2022
  • DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries / paper / project / CoRL 2021
  • Translating Images into Maps / paper / project / ICRA 2022
  • GitNet: Geometric Prior-based Transformation for Birds-Eye-View Segmentation / paper / ECCV 2022
  • PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images / paper / project / arXiv
  • ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection / paper / supplemental / project / WACV 2022
  • MV-FCOS3D++: Multi-View Camera-Only 4D Object Detection with Pretrained Monocular Backbones / paper / project / arXiv
  • FIERY: Future Instance Prediction in Bird's-Eye View From Surround Monocular Cameras / paper / supplemental / paper / ICCV 2021
  • BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving / paper / project / arXiv


BEV LiDAR

A general pipeline in BEV LiDAR

Related literature:

  • VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection / paper / supplemental / VoxelNet
  • SECOND: Sparsely Embedded Convolutional Detection / paper / project / Sensors 2018 / SECOND
  • Center-Based 3D Object Detection and Tracking / paper / supplemental / project / CVPR 2021 / CenterPoint
  • PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection / paper / project / CVPR 2020 / PV-RCNN
  • PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection / paper / project / arXiv / PV-RCNN++
  • Structure Aware Single-Stage 3D Object Detection From Point Cloud / paper / project / CVPR 2020 / SA-SSD
  • Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection / paper / project / AAAI 2021 / Voxel R-CNN
  • Object DGCNN: 3D Object Detection using Dynamic Graphs / paper / NeurIPS 2021 / Object DGCNN
  • Voxel Transformer for 3D Object Detection / paper / ICCV 2021 / VoTr
  • Embracing Single Stride 3D Object Detector With Sparse Transformer / paper / supplemental / project / CVPR 2022 / SST
  • AFDetV2: Rethinking the Necessity of the Second Stage for Object Detection from Point Clouds / paper / AAAI 2022 / AFDetV2
  • PointPillars: Fast Encoders for Object Detection From Point Clouds / paper / CVPR 2019 / PointPillars

BEV Fusion

BEV Fusion related literature

  • Unifying Voxel-based Representation with Transformer for 3D Object Detection / paper / project / arXiv
  • MVFuseNet: Improving End-to-End Object Detection and Motion Forecasting Through Multi-View Fusion of LiDAR Data / paper / CVPR 2021 / MVFuseNet
  • UniFormer: Unified Multi-view Fusion Transformer for Spatial-Temporal Representation in Bird's-Eye-View / paper / arXiv
  • BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation / paper / project / arXiv / BEVFusion
  • BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework / paper / project / arXiv / BEVFusion

Industrial Roadmap of BEV Perception

Practical Recipe of BEV Perception

BEV Camera


Conventional Methods Camera 3D Object Detection

  • Monocular 3D Object Detection for Autonomous Driving / paper / CVPR 2016 / Mono3D
  • 3D Bounding Box Estimation Using Deep Learning and Geometry / paper / CVPR 2017 / Deep3DBox
  • 3D-RCNN: Instance-Level 3D Object Reconstruction via Render-and-Compare / paper / video / project / CVPR 2018 / 3D-RCNN
  • Objects as Points / paper / project / arXiv / CenterNet
  • Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving / paper / supplemental / project / CVPR 2019 / Pseudo-Lidar
  • M3D-RPN: Monocular 3D Region Proposal Network for Object Detection / paper / video / project / ICCV 2019 / M3D-RPN
  • Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction / paper / supplemental / project / CVPR 2019 / MonoPSR
  • Orthographic Feature Transform for Monocular 3D Object Detection / paper / project / arXiv / OFTNet
  • ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape / paper / supplemental / CVPR 2019 / ROI-10D
  • SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation / paper / project / CVPR 2020 / SMOKE
  • Categorical Depth Distribution Network for Monocular 3D Object Detection / paper / supplemental / project / CVPR 2021 / CaDDN
  • FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection / paper / supplemental / project / ICCV 2021 / FCOS3D
  • FCOS: Fully Convolutional One-Stage Object Detection / paper / project / ICCV 2019 / FCOS
  • Probabilistic and Geometric Depth: Detecting Objects in Perspective / paper / project / ? / PGD

Conventional Methods LiDAR Detection

  • Deep Hough Voting for 3D Object Detection in Point Clouds / paper / supplemental / video / project / ICCV 2019 / VoteNet
  • PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud / paper / project / CVPR 2019 / PointRCNN
  • From Points to Parts: 3D Object Detection From Point Cloud With Part-Aware and Part-Aggregation Network / paper / project / TPAMI 2021 / Part-$A^2$
  • H3DNet: 3D Object Detection Using Hybrid Geometric Primitives / paper / project / ECCV 2020 / H3DNet
  • 3D Object Detection With Pointformer / paper / supplemental / project / CVPR 2021 / Pointformer
  • Back-Tracing Representative Points for Voting-Based 3D Object Detection in Point Clouds / paper / project / CVPR 2021 / BRNet
  • Group-Free 3D Object Detection via Transformers / paper / supplemental / project / ICCV 2021 / Group-Free
  • RBGNet: Ray-Based Grouping for 3D Object Detection / paper / supplemental / project / CVPR 2022 / RBGNet
  • 3DSSD: Point-Based 3D Single Stage Object Detector / paper / project / CVPR 2020 / 3DSSD

Conventional Methods LiDAR Segmentation

  • PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation / paper / supplemental / video / project / CVPR 2017 / PointNet
  • PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space / paper / project / NIPS 2017 / PointNet++
  • SpiderCNN: Deep Learning on Point Sets with Parameterized Convolutional Filters / paper / project / ECCV 2018 / SpiderCNN
  • Dynamic Graph CNN for Learning on Point Clouds / paper / project / ACM Transactions on Graphics 2019 / DGCNN
  • KPConv: Flexible and Deformable Convolution for Point Clouds / paper / supplemental / project / ICCV 2019 / KPConv
  • Point Transformer / paper / supplemental / project / ICCV 2021 / Point Transformer
  • RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds / paper / supplemental / project / CVPR 2020 / RandLA-Net
  • PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation / paper / video / project / CVPR 2020 / PolarNet
  • Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation / paper / project / CVPR 2021 / Cylinder3D
  • (AF)$^2$-S3Net: Attentive Feature Fusion With Adaptive Feature Selection for Sparse Semantic Segmentation Network / paper / CVPR 2021 / (AF)$^2$-S3Net
  • TORNADO-Net: mulTiview tOtal vaRiatioN semAntic segmentation with Diamond inceptiOn module / paper / ICRA 2021 / TornadoNet
  • AMVNet: Assertion-based Multi-View Fusion Network for LiDAR Semantic Segmentation / paper / arXiv / AMVNet
  • DRINet: A Dual-Representation Iterative Learning Network for Point Cloud Segmentation / paper / ICCV 2021 / DRINet
  • DRINet++: Efficient Voxel-as-point Point Cloud Segmentation / paper / arXiv / DRINet++
  • Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution / paper / project / ECCV 2020 / SPVConv
  • RPVNet: A Deep and Efficient Range-Point-Voxel Fusion Network for LiDAR Point Cloud Segmentation / paper / ICCV 2021 / RPVNet
  • Learning 3D Semantic Segmentation with only 2D Image Supervision / paper / 3DV 2021 / 2D3DNet
  • 2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds / paper / project / ECCV 2022 / 2DPASS

Conventional Methods Sensor Fusion

  • MVX-Net: Multimodal VoxelNet for 3D Object Detection / paper / ICRA 2019 / MVX-Net
  • Multi-Task Multi-Sensor Fusion for 3D Object Detection / paper / CVPR 2019 / MMF
  • Deep Continuous Fusion for Multi-Sensor 3D Object Detection / paper / ECCV 2018 / ContFuse
  • PointAugmenting: Cross-Modal Augmentation for 3D Object Detection / paper / project / CVPR 2021 / PointAugmenting
  • AutoAlignV2: Deformable Feature Aggregation for Dynamic Multi-Modal 3D Object Detection / paper / project / ECCV 2022 / AutoAlignV2
  • DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection / paper / supplemental / project / CVPR 2022 / DeepFusion
  • CenterFusion: Center-Based Radar and Camera Fusion for 3D Object Detection / paper / project / WACV 2021 / CenterFusion
  • FUTR3D: A Unified Sensor Fusion Framework for 3D Detection / paper / arXiv / FUTR3D
  • TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection With Transformers / paper / supplemental / project / CVPR 2022 / TransFusion
  • DeepInteraction: 3D Object Detection via Modality Interaction / paper / project / arXiv / DeepInteraction
  • PointPainting: Sequential Fusion for 3D Object Detection / paper / supplemental / CVPR 2020 / PointPainting
  • Frustum PointNets for 3D Object Detection From RGB-D Data / paper / supplemental / project / CVPR 2018 / F-PointNet
  • Multi-View 3D Object Detection Network for Autonomous Driving / paper / video / CVPR 2017 / MV3D
  • Joint 3D Proposal Generation and Object Detection from View Aggregation / paper / project / IROS 2018 / AVOD
  • CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection / paper / project / IROS 2020 / CLOCs



License: Apache License 2.0