niubencoolboy / DriveAGI

Embracing Foundation Models into Autonomous Agent and System

DriveAGI

This is "The One" project that OpenDriveLab is committed to contributing to the community. It offers some thoughts and a general picture of how to bring foundation models into autonomous driving.

At A Glance

Here are the key components for building a large foundation model tailored to an autonomous system.

[Figure: overview]

DriveData

Abstract

With the continuous maturation and application of autonomous driving technology, a systematic examination of open-source autonomous driving datasets becomes instrumental in fostering the robust evolution of the industry ecosystem. In this survey, we provide a comprehensive analysis of more than 70 papers on the timeline, impact, challenges, and future trends in autonomous driving datasets.

Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future

@misc{li2023opensourced,
      title={Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future}, 
      author={Hongyang Li and Yang Li and Huijie Wang and Jia Zeng and Pinlong Cai and Huilin Xu and Dahua Lin and Junchi Yan and Feng Xu and Lu Xiong and Jingdong Wang and Futang Zhu and Kai Yan and Chunjing Xu and Tiancai Wang and Beipeng Mu and Shaoqing Ren and Zhihui Peng and Yu Qiao},
      year={2023},
      eprint={2312.03408},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

[Figure: overview]

Current autonomous driving datasets can broadly be categorized into two generations since the 2010s. We define the Impact (y-axis) of a dataset based on sensor configuration, input modality, task category, data scale, ecosystem, etc.

[Figure: overview]

Related Work Collection

We present comprehensive paper collections, leaderboards, and challenges.

Challenges and Leaderboards

| Title | Host | Year | Task | Entry |
| --- | --- | --- | --- | --- |
| Autonomous Driving Challenge | OpenDriveLab | CVPR2023 | Perception / OpenLane Topology | 111 |
| | | | Perception / Online HD Map Construction | |
| | | | Perception / 3D Occupancy Prediction | |
| | | | Prediction & Planning / nuPlan Planning | |
| Waymo Open Dataset Challenges | Waymo | CVPR2023 | Perception / 2D Video Panoptic Segmentation | 35 |
| | | | Perception / Pose Estimation | |
| | | | Prediction / Motion Prediction | |
| | | | Prediction / Sim Agents | |
| | | CVPR2022 | Prediction / Motion Prediction | 128 |
| | | | Prediction / Occupancy and Flow Prediction | |
| | | | Perception / 3D Semantic Segmentation | |
| | | | Perception / 3D Camera-only Detection | |
| | | CVPR2021 | Prediction / Motion Prediction | 115 |
| | | | Prediction / Interaction Prediction | |
| | | | Perception / Real-time 3D Detection | |
| | | | Perception / Real-time 2D Detection | |
| Argoverse Challenges | Argoverse | CVPR2023 | Prediction / Multi-agent Forecasting | 81 |
| | | | Perception & Prediction / Unified Sensor-based Detection, Tracking, and Forecasting | |
| | | | Perception / LiDAR Scene Flow | |
| | | | Prediction / 3D Occupancy Forecasting | |
| | | CVPR2022 | Perception / 3D Object Detection | 81 |
| | | | Prediction / Motion Forecasting | |
| | | | Perception / Stereo Depth Estimation | |
| | | CVPR2021 | Perception / Stereo Depth Estimation | 368 |
| | | | Prediction / Motion Forecasting | |
| | | | Perception / Streaming 2D Detection | |
| CARLA Autonomous Driving Challenge | CARLA Team, Intel | 2023 | Planning / CARLA AD Challenge 2.0 | - |
| | | NeurIPS2022 | Planning / CARLA AD Challenge 1.0 | 19 |
| | | NeurIPS2021 | Planning / CARLA AD Challenge 1.0 | - |
| Guangdong-Hong Kong-Macao Greater Bay Area (Huangpu) International Algorithm Case Competition | Pazhou Laboratory | 2023 | Perception / Cross-scene Monocular Depth Estimation | - |
| | | | Perception / Roadside Millimeter-wave Radar Calibration and Object Tracking | - |
| | | 2022 | Perception / Roadside 3D Perception Algorithm | - |
| | | | Perception / Street-view Storefront Signboard Text Recognition | - |
| AI Driving Olympics | ETH Zurich, University of Montreal, Motional | NeurIPS2021 | Perception / nuScenes Panoptic | 11 |
| | | ICRA2021 | Perception / nuScenes Detection | 456 |
| | | | Perception / nuScenes Tracking | |
| | | | Prediction / nuScenes Prediction | |
| | | | Perception / nuScenes LiDAR Segmentation | |
| Jittor Artificial Intelligence Algorithm Challenge | Department of Information Sciences, National Natural Science Foundation of China | 2021 | Perception / Traffic Sign Detection | 37 |
| KITTI Vision Benchmark Suite | University of Tübingen | 2012 | Perception / Stereo, Flow, Scene Flow, Depth, Odometry, Object, Tracking, Road, Semantics | 5,610 |

Perception Datasets

| Dataset | Year | Diversity: Scenes | Diversity: Hours | Diversity: Region | Sensor: Camera | Sensor: Lidar | Sensor: Other | Annotation | Paper |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| KITTI | 2012 | 50 | 6 | EU | Front-view | | GPS & IMU | 2D BBox & 3D BBox | Link |
| Cityscapes | 2016 | - | - | EU | Front-view | | | 2D Seg | Link |
| Lost and Found | 2016 | 112 | - | - | Front-view | | | 2D Seg | Link |
| Mapillary | 2016 | - | - | Global | Street-view | | | 2D Seg | Link |
| DDD17 | 2017 | 36 | 12 | EU | Front-view | | GPS & CAN-bus & Event Camera | - | Link |
| Apolloscape | 2016 | 103 | 2.5 | AS | Front-view | | GPS & IMU | 3D BBox & 2D Seg | Link |
| BDD-X | 2018 | 6984 | 77 | NA | Front-view | | | Language | Link |
| HDD | 2018 | - | 104 | NA | Front-view | | GPS & IMU & CAN-bus | 2D BBox | Link |
| IDD | 2018 | 182 | - | AS | Front-view | | | 2D Seg | Link |
| SemanticKITTI | 2019 | 50 | 6 | EU | | | | 3D Seg | Link |
| Woodscape | 2019 | - | - | Global | 360° | | GPS & IMU & CAN-bus | 3D BBox & 2D Seg | Link |
| DrivingStereo | 2019 | 42 | - | AS | Front-view | | | - | Link |
| Brno-Urban | 2019 | 67 | 10 | EU | Front-view | | GPS & IMU & Infrared Camera | - | Link |
| A*3D | 2019 | - | 55 | AS | Front-view | | | 3D BBox | Link |
| Talk2Car | 2019 | 850 | 283.3 | NA | Front-view | | | Language & 3D BBox | Link |
| Talk2Nav | 2019 | 10714 | - | Sim | 360° | | | Language | Link |
| PIE | 2019 | - | 6 | NA | Front-view | | | 2D BBox | Link |
| UrbanLoco | 2019 | 13 | - | AS & NA | 360° | | IMU | - | Link |
| TITAN | 2019 | 700 | - | AS | Front-view | | | 2D BBox | Link |
| H3D | 2019 | 160 | 0.77 | NA | Front-view | | GPS & IMU | - | Link |
| A2D2 | 2020 | - | 5.6 | EU | 360° | | GPS & IMU & CAN-bus | 3D BBox & 2D Seg | Link |
| CARRADA | 2020 | 30 | 0.3 | NA | Front-view | | Radar | 3D BBox | Link |
| DAWN | 2019 | - | - | Global | Front-view | | | 2D BBox | Link |
| 4Seasons | 2019 | - | - | - | Front-view | | GPS & IMU | - | Link |
| UNDD | 2019 | - | - | - | Front-view | | | 2D Seg | Link |
| SemanticPOSS | 2020 | - | - | AS | | | GPS & IMU | 3D Seg | Link |
| Toronto-3D | 2020 | 4 | - | NA | | | | 3D Seg | Link |
| ROAD | 2021 | 22 | - | EU | Front-view | | | 2D BBox & Topology | Link |
| Reasonable Crowd | 2021 | - | - | Sim | Front-view | | | Language | Link |
| METEOR | 2021 | 1250 | 20.9 | AS | Front-view | | GPS | Language | Link |
| PandaSet | 2021 | 179 | - | NA | 360° | | GPS & IMU | 3D BBox | Link |
| MUAD | 2022 | - | - | Sim | 360° | | | 2D Seg & 2D BBox | Link |
| TAS-NIR | 2022 | - | - | - | Front-view | | Infrared Camera | 2D Seg | Link |
| LiDAR-CS | 2022 | 6 | - | Sim | | | | 3D BBox | Link |
| WildDash | 2022 | - | - | - | Front-view | | | 2D Seg | Link |
| OpenScene | 2023 | 1000 | 5.5 | AS & NA | 360° | | | 3D Occ | Link |
| ZOD | 2023 | 1473 | 8.2 | EU | 360° | | GPS & IMU & CAN-bus | 3D BBox & 2D Seg | Link |
| nuScenes | 2019 | 1000 | 5.5 | AS & NA | 360° | | GPS & CAN-bus & Radar & HDMap | 3D BBox & 3D Seg | Link |
| Argoverse V1 | 2019 | 324k | 320 | NA | 360° | | HDMap | 3D BBox & 3D Seg | Link |
| Waymo | 2019 | 1000 | 6.4 | NA | 360° | | | 2D BBox & 3D BBox | Link |
| KITTI-360 | 2020 | 366 | 2.5 | EU | 360° | | | 3D BBox & 3D Seg | Link |
| ONCE | 2021 | - | 144 | AS | 360° | | | 3D BBox | Link |
| nuPlan | 2021 | - | 120 | AS & NA | 360° | | | 3D BBox | Link |
| Argoverse V2 | 2022 | 1000 | 4 | NA | 360° | | HDMap | 3D BBox | Link |
| DriveLM | 2023 | 1000 | 5.5 | AS & NA | 360° | | | Language | Link |
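
When choosing a dataset from the Perception Datasets table, it can help to treat each row as a record and filter on the columns that matter. The following is a minimal sketch under stated assumptions: the `PerceptionDataset` class and the `select` helper are hypothetical names introduced only for illustration, they are not part of any OpenDriveLab tooling, and only a handful of rows and columns from the table are copied in.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PerceptionDataset:
    """One row of the table (hypothetical structure, subset of columns)."""
    name: str
    year: int
    region: str
    camera: str
    annotation: str


# A few rows copied from the Perception Datasets table.
DATASETS = [
    PerceptionDataset("KITTI", 2012, "EU", "Front-view", "2D BBox & 3D BBox"),
    PerceptionDataset("nuScenes", 2019, "AS & NA", "360°", "3D BBox & 3D Seg"),
    PerceptionDataset("Talk2Car", 2019, "NA", "Front-view", "Language & 3D BBox"),
    PerceptionDataset("OpenScene", 2023, "AS & NA", "360°", "3D Occ"),
    PerceptionDataset("DriveLM", 2023, "AS & NA", "360°", "Language"),
]


def select(
    datasets: List[PerceptionDataset],
    camera: Optional[str] = None,
    annotation_contains: Optional[str] = None,
    min_year: Optional[int] = None,
) -> List[str]:
    """Return the names of datasets matching every filter that is set."""
    names = []
    for d in datasets:
        if camera is not None and d.camera != camera:
            continue
        if annotation_contains is not None and annotation_contains not in d.annotation:
            continue
        if min_year is not None and d.year < min_year:
            continue
        names.append(d.name)
    return names


# Surround-view datasets that carry language annotations.
print(select(DATASETS, camera="360°", annotation_contains="Language"))
```

Filters that are left as `None` are ignored, so the same helper covers queries such as "everything released since 2023".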

Mapping Datasets
Dataset Year Diversity Sensor Annotation Paper
Scenes Frames Camera Lidar Type Space Inst. Track
Caltech Lanes 2008 4 1224/1224 PV Link
VPG 2017 - 20K/20K PV - Link
TuSimple 2017 6.4K 6.4K/128K PV Link
CULane 2018 - 133K/133K PV - Link
ApolloScape 2018 235 115K/115K PV Link
LLAMAS 2019 14 79K/100K Front-view Image Laneline PV Link
3D Synthetic 2020 - 10K/10K PV - Link
CurveLanes 2020 - 150K/150K PV - Link
VIL-100 2021 100 10K/10K PV Link
OpenLane-V1 2022 1K 200K/200K 3D Link
ONCE-3DLane 2022 - 211K/211K 3D - Link
OpenLane-V2 2023 2K 72K/72K Multi-view Image Lane Centerline, Lane Segment 3D Link

Prediction and Planning Datasets

| Subtask | Input | Output | Evaluation | Dataset |
| --- | --- | --- | --- | --- |
| Motion Prediction | Surrounding Traffic States | Spatiotemporal Trajectories of Single/Multiple Vehicle(s) | Displacement Error | Argoverse, nuScenes, Waymo, Interaction, MONA |
| Trajectory Planning | Motion States for Ego Vehicles, Scenario Cognition and Prediction | Trajectories for Ego Vehicles | Displacement Error, Safety, Compliance, Comfort | nuPlan, CARLA, MetaDrive, Apollo |
| Path Planning | Maps for Road Network | Routes Connecting to Nodes and Links | Efficiency, Energy Conservation | OpenStreetMap, Transportation Networks, DTAlite, PeMS, New York City Taxi Data |
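
The Displacement Error listed in the Evaluation column is commonly reported as the average displacement error (ADE, the mean point-wise distance over the prediction horizon) and the final displacement error (FDE, the distance at the last timestep). The sketch below assumes 2D trajectories sampled at matching timesteps; the function name `displacement_errors` is hypothetical, and individual benchmarks differ in details such as horizon length and how multiple candidate trajectories are scored.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def displacement_errors(pred: Sequence[Point], gt: Sequence[Point]) -> Tuple[float, float]:
    """Return (ADE, FDE) between a predicted and a ground-truth trajectory.

    Both trajectories must be non-empty, have the same length, and be
    sampled at the same timesteps (an assumption of this sketch).
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between prediction and ground truth at each timestep.
    dists = [math.hypot(px - gx, py - gy) for (px, py), (gx, gy) in zip(pred, gt)]
    ade = sum(dists) / len(dists)  # mean over the horizon
    fde = dists[-1]                # error at the final timestep
    return ade, fde


# The ground truth moves along the x-axis; the prediction drifts upward.
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, gt)
print(ade, fde)  # 1.0 2.0
```

FDE is never smaller than ADE when the error grows monotonically over time, which is why the two are usually reported together.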

Below are the latest updates from our team on the DriveData side. Details of DriveEngine and DriveAGI will be released in the future.

DriveLM

Introducing the first benchmark on language prompts for driving.

Quick facts:

OpenScene

The largest 3D occupancy forecasting dataset to date for visual pre-training.

Quick facts:

OpenLane-V2 Update

Enriching OpenLane-V2 with Standard Definition (SD) maps and scene elements.

Quick facts:

Collaborating Organizations

[Figure: overview]

License: Apache License 2.0