Little-Podi / Collaborative_Perception

This repository is a paper digest of recent advances in collaborative / cooperative / multi-agent perception for V2I / V2V / V2X autonomous driving scenarios.

Collaborative Perception

This repository is a paper digest of recent advances in collaborative / cooperative / multi-agent perception for V2I / V2V / V2X autonomous driving scenarios. Papers are listed in alphabetical order by first character.

Note: I find it hard to compare all methods fairly on each benchmark, since some published results are obtained without specified training and testing settings, or even with modified model architectures. In fact, many works evaluate all baselines under their own settings and report those numbers. It is therefore likely that you will find inconsistencies between papers. Hence, I discarded the collection and reproduction of all the benchmarks in a previous update. If you are interested, you can find plenty of results in this archived version.

🌟Recommendation

Helpful Learning Resource 👍👍👍

  • (Survey) Collaborative Perception for Connected and Autonomous Driving: Challenges, Possible Solutions and Opportunities [paper], V2X Cooperative Perception for Autonomous Driving: Recent Advances and Challenges [paper], Towards Vehicle-to-Everything Autonomous Driving: A Survey on Collaborative Perception [paper], Collaborative Perception in Autonomous Driving: Methods, Datasets and Challenges [paper], A Survey and Framework of Cooperative Perception: From Heterogeneous Singleton to Hierarchical Cooperation [paper]
  • (Talk) Vehicle-to-Vehicle (V2V) Communication (Waabi CVPR 24 Tutorial on Self-Driving Cars) [video], Vehicle-to-Vehicle (V2V) Communication (Waabi CVPR 23 Tutorial on Self-Driving Cars) [video], The Ultimate Solution for L4 Autonomous Driving [video], When Vision Transformers Meet Cooperative Perception [video], Scene Understanding beyond the Visible [video], Robust Collaborative Perception against Communication Interruption [video], Collaborative and Adversarial 3D Perception for Autonomous Driving [video], Vehicle-to-Vehicle Communication for Self-Driving [video], Adversarial Robustness for Self-Driving [video], The Ultimate Form of L4 Perception Systems: Cooperative Driving [video], CoBEVFlow: Solving Temporal Asynchrony in Vehicle-to-Vehicle/Infrastructure Collaborative Perception [video], Where2comm: Next-Generation Collaborative Perception That Reduces Communication Bandwidth by 100,000 Times [video], Multi-Robot Collaborative Perception: From Task-Specific to Task-Agnostic [video], Cooperative Autonomous Driving: Simulation and Perception [video], Beyond-Line-of-Sight Situational Awareness Based on Group Collaboration [video]
  • (Library) V2Xverse: A Codebase for V2X-Based Collaborative End2End Autonomous Driving [code] [doc], HEAL: An Extensible Framework for Open Heterogeneous Collaborative Perception [code] [doc], OpenCOOD: Open Cooperative Detection Framework for Autonomous Driving [code] [doc], CoPerception: SDK for Collaborative Perception [code] [doc], OpenCDA: Simulation Tool Integrated with Prototype Cooperative Driving Automation [code] [doc]
  • (People) Runsheng Xu@UCLA [web], Hao Xiang@UCLA [web], Yiming Li@NYU [web], Zixing Lei@SJTU [web], Yifan Lu@SJTU [web], Siqi Fan@THU [web], Hang Qiu@Waymo [web], Dian Chen@UT Austin [web], Yen-Cheng Liu@GaTech [web], Tsun-Hsuan Wang@MIT [web]
  • (Workshop) Co-Intelligence@ECCV'24 [web], CoPerception@ICRA'23 [web], ScalableAD@ICRA'23 [web]
  • (Background) Current Approaches and Future Directions for Point Cloud Object Detection in Intelligent Agents [video], 3D Object Detection for Autonomous Driving: A Review and New Outlooks [paper], DACOM: Learning Delay-Aware Communication for Multi-Agent Reinforcement Learning [video], A Survey of Multi-Agent Reinforcement Learning with Communication [paper]

Typical Collaboration Modes 🤝🤝🤝
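The Mode field used throughout the lists below takes the values Early, Intermediate, and Late Collaboration. A minimal sketch of how these are commonly understood (agents share raw sensor data, intermediate features, or final outputs, respectively) follows. Everything in it is an illustrative assumption rather than any listed paper's method: the 1-D grid, the hit-counting `encode`, the thresholding `detect`, and summation as the fusion rule are all hypothetical toys.

```python
from typing import List, Set

GRID = 8  # number of cells in a hypothetical 1-D toy world


def encode(points: List[int]) -> List[int]:
    """Toy 'backbone': count raw sensor hits per grid cell."""
    feat = [0] * GRID
    for p in points:
        feat[p] += 1
    return feat


def detect(feat: List[int], thresh: int = 2) -> Set[int]:
    """Toy 'head': a cell counts as an object if it has >= thresh hits."""
    return {i for i, v in enumerate(feat) if v >= thresh}


def early(agents: List[List[int]]) -> Set[int]:
    """Early collaboration: agents transmit raw points, then one model runs."""
    merged = [p for pts in agents for p in pts]
    return detect(encode(merged))


def intermediate(agents: List[List[int]]) -> Set[int]:
    """Intermediate collaboration: agents transmit features, fused before the head."""
    feats = [encode(pts) for pts in agents]
    fused = [sum(col) for col in zip(*feats)]  # assumed fusion rule: elementwise sum
    return detect(fused)


def late(agents: List[List[int]]) -> Set[int]:
    """Late collaboration: agents transmit their own detections, which are merged."""
    return set().union(*(detect(encode(pts)) for pts in agents))


# Two agents; each sees cell 4 only once, so neither detects it alone.
agents = [[1, 1, 4], [4, 6, 6]]
print(sorted(early(agents)))         # [1, 4, 6]
print(sorted(intermediate(agents)))  # [1, 4, 6]
print(sorted(late(agents)))          # [1, 6]
```

In this toy setup the encoder is linear, so early and intermediate collaboration agree; with learned encoders they generally differ. Late collaboration misses cell 4 because neither agent alone collects enough hits there.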

Possible Optimization Directions 🔥🔥🔥

🔖Method and Framework

Note: {Related} denotes that it is not a pure collaborative perception paper but has related content.

Selected Preprint

  • AR2VP (Dynamic V2X Autonomous Perception from Road-to-Vehicle Vision) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2X-Sim
    • Task: Segmentation
    • Input: Point Cloud
  • CPPC (Point Cluster: A Compact Message Unit for Communication-Efficient Collaborative Perception) [paper&review] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2XSet, DAIR-V2X
    • Task: Detection
    • Input: Point Cloud
  • CMP (CMP: Cooperative Motion Prediction with Multi-Agent Communication) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V
    • Task: Forecasting
    • Input: Point Cloud
  • CoBEVFusion (CoBEVFusion: Cooperative Perception with LiDAR-Camera Bird's-Eye View Fusion) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V
    • Task: Segmentation, Detection
    • Input: RGB Image, Point Cloud
  • CoBEVGlue (Self-Localized Collaborative Perception) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2V4Real, DAIR-V2X
    • Task: Detection
    • Input: Point Cloud
  • CoCMT (CoCMT: Towards Communication-Efficient Cross-Modal Transformer For Collaborative Perception) [paper&review] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2V4Real
    • Task: Detection
    • Input: RGB Image, Point Cloud
  • CoDriving (Towards Collaborative Autonomous Driving: Simulation Platform and End-to-End System) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2V4Real, V2X-Sim, DAIR-V2X
    • Task: Planning
    • Input: RGB Image, Point Cloud
  • CoDrivingLLM (Towards Interactive and Learnable Cooperative Driving Automation: A Large Language Model-Driven Decision-making Framework) [paper] [code]
    • Mode: Late Collaboration
    • Dataset: Unknown
    • Task: Planning
    • Input: Scene State
  • CollaMamba (CollaMamba: Efficient Collaborative Perception with Cross-Agent Spatial-Temporal State Space Model) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2XSet, V2V4Real
    • Task: Detection
    • Input: Point Cloud
  • CoMamba (CoMamba: Real-time Cooperative Perception Unlocked with State Space Models) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2XSet, V2V4Real
    • Task: Detection
    • Input: Point Cloud
  • CooPre (CooPre: Cooperative Pretraining for V2X Cooperative Perception)
    • Mode: Early Collaboration, Intermediate Collaboration
    • Dataset: OPV2V, V2V4Real, V2X-Real
    • Task: Detection
    • Input: Point Cloud
  • CP-Guard+ (CP-Guard+: A New Paradigm for Malicious Agent Detection and Defense in Collaborative Perception) [paper&review] [code]
    • Mode: Intermediate Collaboration
    • Dataset: CP-GuardBench, V2X-Sim
    • Task: Detection
    • Input: Point Cloud
  • CTCE (Leveraging Temporal Contexts to Enhance Vehicle-Infrastructure Cooperative Perception) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: DAIR-V2X-Seq, V2X-Sim
    • Task: Detection
    • Input: RGB Image
  • Debrief (Talking Vehicles: Cooperative Driving via Natural Language) [paper&review] [code]
    • Mode: Late Collaboration
    • Dataset: TalkingVehiclesGym, CARLA (simulator)
    • Task: Planning
    • Input: Scene State
  • DiffCP (DiffCP: Ultra-Low Bit Collaborative Perception via Diffusion Model) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V
    • Task: Detection
    • Input: Point Cloud
  • Direct-CP (Direct-CP: Directed Collaborative Perception for Connected and Autonomous Vehicles via Proactive Attention) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2X-Sim
    • Task: Detection
    • Input: Point Cloud
  • LMMCoDrive (LMMCoDrive: Cooperative Driving with Large Multimodal Model) [paper] [code]
    • Mode: Late Collaboration
    • Dataset: AMoD, CARLA (simulator)
    • Task: Planning
    • Input: Scene State
  • MOT-CUP (Collaborative Multi-Object Tracking with Conformal Uncertainty Propagation) [paper] [code]
    • Mode: Early Collaboration, Intermediate Collaboration
    • Dataset: V2X-Sim
    • Task: Tracking
    • Input: Point Cloud
  • ParCon (ParCon: Noise-Robust Collaborative Perception via Multi-Module Parallel Connection) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2XSet, DAIR-V2X
    • Task: Detection
    • Input: Point Cloud
  • PragComm (Pragmatic Communication in Multi-Agent Collaborative Perception) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2X-Sim, DAIR-V2X
    • Task: Detection
    • Input: Point Cloud
  • QUEST (QUEST: Query Stream for Vehicle-Infrastructure Cooperative Perception) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: DAIR-V2X-Seq
    • Task: Detection
    • Input: RGB Image
  • RopeBEV (RopeBEV: A Multi-Camera Roadside Perception Network in Bird’s-Eye-View)
    • Mode: Intermediate Collaboration
    • Dataset: RoScenes
    • Task: Segmentation
    • Input: RGB Image
  • RCDN (RCDN: Towards Robust Camera-Insensitivity Collaborative Perception via Dynamic Feature-Based 3D Neural Modeling) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V-N
    • Task: Segmentation
    • Input: RGB Image
  • SiCP (SiCP: Simultaneous Individual and Cooperative Perception for 3D Object Detection in Connected and Automated Vehicles) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V
    • Task: Detection
    • Input: Point Cloud
  • STAMP (STAMP: Scalable Task- And Model-Agnostic Collaborative Perception) [paper&review] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2V4Real
    • Task: Segmentation, Detection
    • Input: Point Cloud
  • UniV2X (End-to-End Autonomous Driving through V2X Cooperation) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: DAIR-V2X
    • Task: Planning, Occupancy, Segmentation, Tracking
    • Input: RGB Image
  • VIMI (VIMI: Vehicle-Infrastructure Multi-View Intermediate Fusion for Camera-Based 3D Object Detection) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: DAIR-V2X
    • Task: Detection
    • Input: RGB Image
  • V2X-DGW (V2X-DGW: Domain Generalization for Multi-Agent Perception under Adverse Weather Conditions) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V-w, V2XSet-w
    • Task: Detection
    • Input: Point Cloud
  • V2X-M2C (V2X-M2C: Efficient Multi-Module Collaborative Perception with Two Connections) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2XSet
    • Task: Detection
    • Input: Point Cloud
  • V2X-PC (V2X-PC: Vehicle-to-Everything Collaborative Perception via Point Cluster) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2XSet, DAIR-V2X
    • Task: Detection
    • Input: Point Cloud
  • V2X-R (V2X-R: Cooperative LiDAR-4D Radar Fusion for 3D Object Detection with Denoising Diffusion) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2X-R
    • Task: Detection
    • Input: Point Cloud

CVPR 2024

  • CoHFF (Collaborative Semantic Occupancy Prediction with Hybrid Feature Fusion in Connected Automated Vehicles) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V+
    • Task: Occupancy, Segmentation, Detection
    • Input: RGB Image
  • CoopDet3D (TUMTraf V2X Cooperative Perception Dataset) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: TUMTraf-V2X
    • Task: Detection
    • Input: RGB Image, Point Cloud
  • CodeFilling (Communication-Efficient Collaborative Perception via Information Filling with Codebook) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: DAIR-V2X, OPV2VH+
    • Task: Detection
    • Input: RGB Image, Point Cloud
  • MRCNet (Multi-Agent Collaborative Perception via Motion-Aware Robust Communication Network) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2XSet, V2X-Sim
    • Task: Detection
    • Input: Point Cloud

ECCV 2024

  • Hetecooper (Hetecooper: Feature Collaboration Graph for Heterogeneous Collaborative Perception) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2V4Real
    • Task: Detection
    • Input: Point Cloud
  • Infra-Centric CP (Rethinking the Role of Infrastructure in Collaborative Perception) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2XSet, V2X-Sim
    • Task: Detection
    • Input: Point Cloud

NeurIPS 2024

  • V2X-Graph (Learning Cooperative Trajectory Representations for Motion Forecasting) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: DAIR-V2X-Seq, DAIR-V2X-Traj
    • Task: Forecasting
    • Input: Vector Map

ICLR 2024

  • HEAL (An Extensible Framework for Open Heterogeneous Collaborative Perception) [paper&review] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V-H, DAIR-V2X
    • Task: Detection
    • Input: RGB Image, Point Cloud

AAAI 2024

  • CMiMC (What Makes Good Collaborative Views? Contrastive Mutual Information Maximization for Multi-Agent Perception) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2X-Sim
    • Task: Detection
    • Input: Point Cloud
  • DI-V2X (DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2XSet, DAIR-V2X
    • Task: Detection
    • Input: Point Cloud
  • V2XFormer (DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: DeepAccident
    • Task: Detection, Forecasting
    • Input: RGB Image

WACV 2024

  • MACP (MACP: Efficient Model Adaptation for Cooperative Perception) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2V4Real
    • Task: Detection
    • Input: Point Cloud

ICRA 2024

  • FreeAlign (Robust Collaborative Perception without External Localization and Clock Devices) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, DAIR-V2X
    • Task: Detection
    • Input: Point Cloud

CVPR 2023

  • {Related} BEVHeight (BEVHeight: A Robust Framework for Vision-Based Roadside 3D Object Detection) [paper] [code]
    • Mode: No Collaboration (only infrastructure data)
    • Dataset: DAIR-V2X, V2X-Sim
    • Task: Detection
    • Input: RGB Image
  • CoCa3D (Collaboration Helps Camera Overtake LiDAR in 3D Detection) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V+, DAIR-V2X, CoPerception-UAV+
    • Task: Detection
    • Input: RGB Image
  • FF-Tracking (V2X-Seq: The Large-Scale Sequential Dataset for the Vehicle-Infrastructure Cooperative Perception and Forecasting) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: DAIR-V2X-Seq
    • Task: Tracking
    • Input: Point Cloud

NeurIPS 2023

  • CoBEVFlow (Robust Asynchronous Collaborative 3D Detection via Bird's Eye View Flow) [paper&review] [code]
    • Mode: Intermediate Collaboration
    • Dataset: DAIR-V2X, IRV2V
    • Task: Detection
    • Input: Point Cloud
  • FFNet (Flow-Based Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection) [paper&review] [code]
    • Mode: Intermediate Collaboration
    • Dataset: DAIR-V2X
    • Task: Detection
    • Input: Point Cloud
  • How2comm (How2comm: Communication-Efficient and Collaboration-Pragmatic Multi-Agent Perception) [paper&review] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2XSet, DAIR-V2X
    • Task: Detection
    • Input: Point Cloud

ICCV 2023

  • CORE (CORE: Cooperative Reconstruction for Multi-Agent Perception) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V
    • Task: Segmentation, Detection
    • Input: Point Cloud
  • HM-ViT (HM-ViT: Hetero-Modal Vehicle-to-Vehicle Cooperative Perception with Vision Transformer) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V
    • Task: Detection
    • Input: RGB Image, Point Cloud
  • ROBOSAC (Among Us: Adversarially Robust Collaborative Perception by Consensus) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2X-Sim
    • Task: Detection
    • Input: Point Cloud
  • SCOPE (Spatio-Temporal Domain Awareness for Multi-Agent Collaborative Perception) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2XSet, DAIR-V2X
    • Task: Segmentation, Detection
    • Input: Point Cloud
  • TransIFF (TransIFF: An Instance-Level Feature Fusion Framework for Vehicle-Infrastructure Cooperative 3D Detection with Transformers) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: DAIR-V2X
    • Task: Detection
    • Input: Point Cloud
  • UMC (UMC: A Unified Bandwidth-Efficient and Multi-Resolution Based Collaborative Perception Framework) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2X-Sim
    • Task: Detection
    • Input: Point Cloud

ICLR 2023

  • {Related} CO3 (CO3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving) [paper&review] [code]
    • Mode: Early Collaboration (for contrastive learning)
    • Dataset: DAIR-V2X
    • Task: Representation Learning
    • Input: Point Cloud

CoRL 2023

  • BM2CP (BM2CP: Efficient Collaborative Perception with LiDAR-Camera Modalities) [paper&review] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, DAIR-V2X
    • Task: Detection
    • Input: RGB Image, Point Cloud

MM 2023

  • DUSA (DUSA: Decoupled Unsupervised Sim2Real Adaptation for Vehicle-to-Everything Collaborative Perception) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2XSet, DAIR-V2X
    • Task: Detection
    • Input: Point Cloud
  • FeaCo (FeaCo: Reaching Robust Feature-Level Consensus in Noisy Pose Conditions) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2V4Real
    • Task: Detection
    • Input: Point Cloud
  • What2comm (What2comm: Towards Communication-Efficient Collaborative Perception via Feature Decoupling) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, V2XSet, DAIR-V2X
    • Task: Detection
    • Input: Point Cloud

WACV 2023

  • AdaFusion (Adaptive Feature Fusion for Cooperative Perception Using LiDAR Point Clouds) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, CODD
    • Task: Detection
    • Input: Point Cloud

ICRA 2023

  • CoAlign (Robust Collaborative 3D Object Detection in Presence of Pose Errors) [paper] [code]
    • Mode: Intermediate Collaboration, Late Collaboration
    • Dataset: OPV2V, V2X-Sim, DAIR-V2X
    • Task: Detection
    • Input: Point Cloud
  • {Related} DMGM (Deep Masked Graph Matching for Correspondence Identification in Collaborative Perception) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: CAD
    • Task: Correspondence Identification
    • Input: RGBD Image
  • Double-M Quantification (Uncertainty Quantification of Collaborative Detection for Self-Driving) [paper] [code]
    • Mode: Early Collaboration, Intermediate Collaboration
    • Dataset: V2X-Sim
    • Task: Detection
    • Input: Point Cloud
  • MAMP (Model-Agnostic Multi-Agent Perception Framework) [paper] [code]
    • Mode: Late Collaboration
    • Dataset: OPV2V
    • Task: Detection
    • Input: Point Cloud
  • MATE (Communication-Critical Planning via Multi-Agent Trajectory Exchange) [paper] [code]
    • Mode: Late Collaboration
    • Dataset: AutoCastSim (simulator), CoBEV-Sim (simulator)
    • Task: Planning
    • Input: Point Cloud
  • MPDA (Bridging the Domain Gap for Multi-Agent Perception) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2XSet
    • Task: Detection
    • Input: Point Cloud
  • WNT (We Need to Talk: Identifying and Overcoming Communication-Critical Scenarios for Self-Driving) [paper] [code]
    • Mode: Late Collaboration
    • Dataset: CoBEV-Sim (simulator)
    • Task: Planning
    • Input: Point Cloud

CVPR 2022

  • Coopernaut (COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: AutoCastSim (simulator)
    • Task: Planning
    • Input: Point Cloud
  • {Related} LAV (Learning from All Vehicles) [paper] [code]
    • Mode: Late Collaboration (for training)
    • Dataset: CARLA (simulator)
    • Task: Planning, Detection (auxiliary supervision), Segmentation (auxiliary supervision)
    • Input: RGB Image, Point Cloud
  • TCLF (DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative 3D Object Detection) [paper] [code]
    • Mode: Late Collaboration
    • Dataset: DAIR-V2X
    • Task: Detection
    • Input: RGB Image, Point Cloud

NeurIPS 2022

  • Where2comm (Where2comm: Efficient Collaborative Perception via Spatial Confidence Maps) [paper&review] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, DAIR-V2X, V2X-Sim, CoPerception-UAV
    • Task: Detection
    • Input: Point Cloud

ECCV 2022

  • SyncNet (Latency-Aware Collaborative Perception) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2X-Sim
    • Task: Detection
    • Input: Point Cloud
  • V2X-ViT (V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2XSet
    • Task: Detection
    • Input: Point Cloud

CoRL 2022

  • CoBEVT (CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers) [paper&review] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V, nuScenes
    • Task: Segmentation, Detection
    • Input: RGB Image, Point Cloud
  • STAR (Multi-Robot Scene Completion: Towards Task-Agnostic Collaborative Perception) [paper&review] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2X-Sim
    • Task: Segmentation, Detection
    • Input: Point Cloud

IJCAI 2022

  • IA-RCP (Robust Collaborative Perception against Communication Interruption) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2X-Sim
    • Task: Detection
    • Input: Point Cloud

MM 2022

  • CRCNet (Complementarity-Enhanced and Redundancy-Minimized Collaboration Network for Multi-agent Perception) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2X-Sim
    • Task: Detection
    • Input: Point Cloud

ICRA 2022

  • AttFuse (OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: OPV2V
    • Task: Detection
    • Input: Point Cloud
  • MP-Pose (Multi-Robot Collaborative Perception with Graph Neural Networks) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: AirSim-MAP
    • Task: Segmentation
    • Input: RGB Image

NeurIPS 2021

  • DiscoNet (Learning Distilled Collaboration Graph for Multi-Agent Perception) [paper&review] [code]
    • Mode: Early Collaboration (teacher model), Intermediate Collaboration (student model)
    • Dataset: V2X-Sim
    • Task: Detection
    • Input: Point Cloud

ICCV 2021

  • Adversarial V2V (Adversarial Attacks On Multi-Agent Communication) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2V-Sim (not publicly available)
    • Task: Adversarial Attack
    • Input: Point Cloud

IROS 2021

  • MASH (Overcoming Obstructions via Bandwidth-Limited Multi-Agent Spatial Handshaking) [paper] [code]
    • Mode: Late Collaboration
    • Dataset: AirSim
    • Task: Segmentation
    • Input: RGB Image

CVPR 2020

  • When2com (When2com: Multi-Agent Perception via Communication Graph Grouping) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: AirSim-MAP
    • Task: Segmentation, Classification
    • Input: RGB Image

ECCV 2020

  • DSDNet (DSDNet: Deep Structured Self-Driving Network) [paper] [code]
    • Mode: Late Collaboration
    • Dataset: nuScenes, CARLA (simulator), ATG4D
    • Task: Planning
    • Input: Point Cloud
  • V2VNet (V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2V-Sim (not publicly available)
    • Task: Detection, Forecasting
    • Input: Point Cloud

CoRL 2020

  • Robust V2V (Learning to Communicate and Correct Pose Errors) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: V2V-Sim (not publicly available)
    • Task: Detection, Forecasting
    • Input: Point Cloud

ICRA 2020

  • MAIN (Enhancing Multi-Robot Perception via Learned Data Association) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: AirSim
    • Task: Segmentation
    • Input: RGB Image
  • Who2com (Who2com: Collaborative Perception via Learnable Handshake Communication) [paper] [code]
    • Mode: Intermediate Collaboration
    • Dataset: AirSim-CP (has an asynchronous issue between views)
    • Task: Segmentation
    • Input: RGB Image

🔖Dataset and Simulator

Note: {Real} denotes that the sensor data is obtained by real-world collection instead of simulation.

Selected Preprint

  • Adver-City (Adver-City: Open-Source Multi-Modal Dataset for Collaborative Perception Under Adverse Weather Conditions) [paper] [code] [project]
  • CP-GuardBench (CP-Guard+: A New Paradigm for Malicious Agent Detection and Defense in Collaborative Perception) [paper&review] [code] [project]
  • {Real} InScope (InScope: A New Real-world 3D Infrastructure-side Collaborative Perception Dataset for Open Traffic Scenarios) [paper] [code] [project]
  • Multi-V2X (Multi-V2X: A Large Scale Multi-modal Multi-penetration-rate Dataset for Cooperative Perception) [paper] [code] [project]
  • OPV2V-N (RCDN: Towards Robust Camera-Insensitivity Collaborative Perception via Dynamic Feature-based 3D Neural Modeling) [paper] [code] [project]
  • V2X-R (V2X-R: Cooperative LiDAR-4D Radar Fusion for 3D Object Detection with Denoising Diffusion) [paper] [code] [project]
  • {Real} V2X-Radar (V2X-Radar: A Multi-Modal Dataset with 4D Radar for Cooperative Perception) [paper] [code] [project]
  • {Real} V2X-Real (V2X-Real: a Large-Scale Dataset for Vehicle-to-Everything Cooperative Perception) [paper] [code] [project]
  • WHALES (WHALES: A Multi-Agent Scheduling Dataset for Enhanced Cooperation in Autonomous Driving) [paper] [code] [project]

CVPR 2024

  • {Real} HoloVIC (HoloVIC: Large-Scale Dataset and Benchmark for Multi-Sensor Holographic Intersection and Vehicle-Infrastructure Cooperative) [paper] [code] [project]
  • {Real} Open Mars Dataset (Multiagent Multitraversal Multimodal Self-Driving: Open MARS Dataset) [paper] [code] [project]
  • {Real} RCooper (RCooper: A Real-World Large-Scale Dataset for Roadside Cooperative Perception) [paper] [code] [project]
  • {Real} TUMTraf-V2X (TUMTraf V2X Cooperative Perception Dataset) [paper] [code] [project]

ECCV 2024

  • {Real} H-V2X (H-V2X: A Large Scale Highway Dataset for BEV Perception) [paper] [code] [project]

NeurIPS 2024

  • {Real} DAIR-V2X-Traj (Learning Cooperative Trajectory Representations for Motion Forecasting) [paper] [code] [project]

ICLR 2024

  • OPV2V-H (An Extensible Framework for Open Heterogeneous Collaborative Perception) [paper&review] [code] [project]

AAAI 2024

  • DeepAccident (DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving) [paper] [code] [project]

CVPR 2023

  • CoPerception-UAV+ (Collaboration Helps Camera Overtake LiDAR in 3D Detection) [paper] [code] [project]
  • {Real} DAIR-V2X-Seq (V2X-Seq: The Large-Scale Sequential Dataset for the Vehicle-Infrastructure Cooperative Perception and Forecasting) [paper] [code] [project]
  • OPV2V+ (Collaboration Helps Camera Overtake LiDAR in 3D Detection) [paper] [code] [project]
  • {Real} V2V4Real (V2V4Real: A Large-Scale Real-World Dataset for Vehicle-to-Vehicle Cooperative Perception) [paper] [code] [project]

NeurIPS 2023

  • IRV2V (Robust Asynchronous Collaborative 3D Detection via Bird's Eye View Flow) [paper&review] [code] [project]

ICCV 2023

  • Roadside-Opt (Optimizing the Placement of Roadside LiDARs for Autonomous Driving) [paper] [code] [project]

ICRA 2023

  • {Real} DAIR-V2X-C Complemented (Robust Collaborative 3D Object Detection in Presence of Pose Errors) [paper] [code] [project]
  • RLS (Analyzing Infrastructure LiDAR Placement with Realistic LiDAR Simulation Library) [paper] [code] [project]
  • V2XP-ASG (V2XP-ASG: Generating Adversarial Scenes for Vehicle-to-Everything Perception) [paper] [code] [project]

CVPR 2022

  • AutoCastSim (COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles) [paper] [code] [project]
  • {Real} DAIR-V2X (DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative 3D Object Detection) [paper] [code] [project]

NeurIPS 2022

  • CoPerception-UAV (Where2comm: Efficient Collaborative Perception via Spatial Confidence Maps) [paper&review] [code] [project]

ECCV 2022

  • V2XSet (V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer) [paper] [code] [project]

ICRA 2022

  • OPV2V (OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication) [paper] [code] [project]

ACCV 2022

  • DOLPHINS (DOLPHINS: Dataset for Collaborative Perception Enabled Harmonious and Interconnected Self-Driving) [paper] [code] [project]

ICCV 2021

  • V2X-Sim (V2X-Sim: Multi-Agent Collaborative Perception Dataset and Benchmark for Autonomous Driving) [paper] [code] [project]

CoRL 2017
