DriveAGI

This is "The One" project that OpenDriveLab is committed to contributing to the community, offering some thoughts and a general picture of how to bring foundation models into autonomous driving.

At A Glance

Here are some key components to construct a large foundation model curated for an autonomous system.

DriveData

Abstract

With the continuous maturation and application of autonomous driving technology, a systematic examination of open-source autonomous driving datasets becomes instrumental in fostering the robust evolution of the industry ecosystem. In this survey, we provide a comprehensive analysis of more than 70 papers on the timeline, impact, challenges, and future trends of autonomous driving datasets.

Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future

@misc{li2023opensourced,
      title={Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future}, 
      author={Hongyang Li and Yang Li and Huijie Wang and Jia Zeng and Pinlong Cai and Huilin Xu and Dahua Lin and Junchi Yan and Feng Xu and Lu Xiong and Jingdong Wang and Futang Zhu and Kai Yan and Chunjing Xu and Tiancai Wang and Beipeng Mu and Shaoqing Ren and Zhihui Peng and Yu Qiao},
      year={2023},
      eprint={2312.03408},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Current autonomous driving datasets can broadly be categorized into two generations since the 2010s. We define the Impact (y-axis) of a dataset based on sensor configuration, input modality, task category, data scale, ecosystem, etc.

Related Work Collection

We present comprehensive paper collections, leaderboards, and challenges.

Challenges and Leaderboards

Title	Host	Year	Task	Entry
Autonomous Driving Challenge	OpenDriveLab	CVPR2023	Perception / OpenLane Topology	111

			Perception / Online HD Map Construction

			Perception / 3D Occupancy Prediction

			Prediction & Planning / nuPlan Planning
Waymo Open Dataset Challenges	Waymo	CVPR2023	Perception / 2D Video Panoptic Segmentation	35

			Perception / Pose Estimation

			Prediction / Motion Prediction

			Prediction / Sim Agents

		CVPR2022	Prediction / Motion Prediction	128

			Prediction / Occupancy and Flow Prediction

			Perception / 3D Semantic Segmentation

			Perception / 3D Camera-only Detection

		CVPR2021	Prediction / Motion Prediction	115

			Prediction / Interaction Prediction

			Perception / Real-time 3D Detection

			Perception / Real-time 2D Detection
Argoverse Challenges	Argoverse	CVPR2023	Prediction / Multi-agent Forecasting	81

			Perception & Prediction / Unified Sensor-based Detection, Tracking, and Forecasting

			Perception / LiDAR Scene Flow

			Prediction / 3D Occupancy Forecasting

		CVPR2022	Perception / 3D Object Detection	81

			Prediction / Motion Forecasting

			Perception / Stereo Depth Estimation

		CVPR2021	Perception / Stereo Depth Estimation	368

			Prediction / Motion Forecasting

			Perception / Streaming 2D Detection
CARLA Autonomous Driving Challenge	CARLA Team, Intel	2023	Planning / CARLA AD Challenge 2.0	-
		NeurIPS2022	Planning / CARLA AD Challenge 1.0	19
		NeurIPS2021	Planning / CARLA AD Challenge 1.0	-
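Guangdong-Hong Kong-Macao Greater Bay Area (Huangpu) International Algorithm Case Competition (粤港澳大湾区（黄埔）国际算法算例大赛)	Pazhou Lab	2023	Perception / Cross-scene Monocular Depth Estimation	-

			Perception / Roadside Millimeter-wave Radar Calibration and Object Tracking	-

		2022	Perception / Roadside 3D Perception Algorithm	-

			Perception / Storefront Signboard Text Recognition in Street-view Images	-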
AI Driving Olympics	ETH Zurich, University of Montreal, Motional	NeurIPS2021	Perception / nuScenes Panoptic	11

		ICRA2021	Perception / nuScenes Detection	456

			Perception / nuScenes Tracking

			Prediction / nuScenes Prediction

			Perception / nuScenes LiDAR Segmentation
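Jittor AI Algorithm Challenge (计图人工智能算法挑战赛)	Department of Information Sciences, National Natural Science Foundation of China	2021	Perception / Traffic Sign Detection	37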
KITTI Vision Benchmark Suite	University of Tübingen	2012	Perception / Stereo, Flow, Scene Flow, Depth, Odometry, Object, Tracking, Road, Semantics	5,610


Perception Datasets

Dataset	Year	Scenes	Hours	Region	Camera	Lidar	Other Sensors	Annotation	Paper
KITTI	2012	50	6	EU	Front-view	✗	GPS & IMU	2D BBox & 3D BBox	Link
Cityscapes	2016	-	-	EU	Front-view	✗		2D Seg	Link
Lost and Found	2016	112	-	-	Front-view	✗		2D Seg	Link
Mapillary	2016	-	-	Global	Street-view	✗		2D Seg	Link
DDD17	2017	36	12	EU	Front-view	✗	GPS & CAN-bus & Event Camera	-	Link
ApolloScape	2016	103	2.5	AS	Front-view	✗	GPS & IMU	3D BBox & 2D Seg	Link
BDD-X	2018	6984	77	NA	Front-view	✗		Language	Link
HDD	2018	-	104	NA	Front-view	✓	GPS & IMU & CAN-bus	2D BBox	Link
IDD	2018	182	-	AS	Front-view	✗		2D Seg	Link
SemanticKITTI	2019	50	6	EU	✗	✓		3D Seg	Link
Woodscape	2019	-	-	Global	360°	✓	GPS & IMU & CAN-bus	3D BBox & 2D Seg	Link
DrivingStereo	2019	42	-	AS	Front-view	✓		-	Link
Brno-Urban	2019	67	10	EU	Front-view	✓	GPS & IMU & Infrared Camera	-	Link
A*3D	2019	-	55	AS	Front-view	✓		3D BBox	Link
Talk2Car	2019	850	283.3	NA	Front-view	✓		Language & 3D BBox	Link
Talk2Nav	2019	10714	-	Sim	360°	✗		Language	Link
PIE	2019	-	6	NA	Front-view	✗		2D BBox	Link
UrbanLoco	2019	13	-	AS & NA	360°	✓	IMU	-	Link
TITAN	2019	700	-	AS	Front-view	✗		2D BBox	Link
H3D	2019	160	0.77	NA	Front-view	✓	GPS & IMU	-	Link
A2D2	2020	-	5.6	EU	360°	✓	GPS & IMU & CAN-bus	3D BBox & 2D Seg	Link
CARRADA	2020	30	0.3	NA	Front-view	✗	Radar	3D BBox	Link
DAWN	2019	-	-	Global	Front-view	✗		2D BBox	Link
4Seasons	2019	-	-	-	Front-view	✗	GPS & IMU	-	Link
UNDD	2019	-	-	-	Front-view	✗		2D Seg	Link
SemanticPOSS	2020	-	-	AS	✗	✓	GPS & IMU	3D Seg	Link
Toronto-3D	2020	4	-	NA	✗	✓		3D Seg	Link
ROAD	2021	22	-	EU	Front-view	✗		2D BBox & Topology	Link
Reasonable Crowd	2021	-	-	Sim	Front-view	✗		Language	Link
METEOR	2021	1250	20.9	AS	Front-view	✗	GPS	Language	Link
PandaSet	2021	179	-	NA	360°	✓	GPS & IMU	3D BBox	Link
MUAD	2022	-	-	Sim	360°	✓		2D Seg & 2D BBox	Link
TAS-NIR	2022	-	-	-	Front-view	✗	Infrared Camera	2D Seg	Link
LiDAR-CS	2022	6	-	Sim	✗	✓		3D BBox	Link
WildDash	2022	-	-	-	Front-view	✗		2D Seg	Link
OpenScene	2023	1000	5.5	AS & NA	360°	✗		3D Occ	Link
ZOD	2023	1473	8.2	EU	360°	✓	GPS & IMU & CAN-bus	3D BBox & 2D Seg	Link
nuScenes	2019	1000	5.5	AS & NA	360°	✓	GPS & CAN-bus & Radar & HDMap	3D BBox & 3D Seg	Link
Argoverse V1	2019	324k	320	NA	360°	✓	HDMap	3D BBox & 3D Seg	Link
Waymo	2019	1000	6.4	NA	360°	✓		2D BBox & 3D BBox	Link
KITTI-360	2020	366	2.5	EU	360°	✓		3D BBox & 3D Seg	Link
ONCE	2021	-	144	AS	360°	✓		3D BBox	Link
nuPlan	2021	-	120	AS & NA	360°	✓		3D BBox	Link
Argoverse V2	2022	1000	4	NA	360°	✓	HDMap	3D BBox	Link
DriveLM	2023	1000	5.5	AS & NA	360°	✗		Language	Link


Mapping Datasets

Dataset	Year	Scenes	Frames	Camera	Lidar	Annotation Type	Space	Inst.	Track	Paper
Caltech Lanes	2008	4	1224/1224		✗		PV	✓	✗	Link
VPG	2017	-	20K/20K		✗		PV	✗	-	Link
TuSimple	2017	6.4K	6.4K/128K		✗		PV	✓	✗	Link
CULane	2018	-	133K/133K		✗		PV	✓	-	Link
ApolloScape	2018	235	115K/115K		✓		PV	✗	✗	Link
LLAMAS	2019	14	79K/100K	Front-view Image	✗	Laneline	PV	✓	✗	Link
3D Synthetic	2020	-	10K/10K		✗		PV	✓	-	Link
CurveLanes	2020	-	150K/150K		✗		PV	✓	-	Link
VIL-100	2021	100	10K/10K		✗		PV	✓	✗	Link
OpenLane-V1	2022	1K	200K/200K		✗		3D	✓	✓	Link
ONCE-3DLane	2022	-	211K/211K		✗		3D	✓	-	Link
OpenLane-V2	2023	2K	72K/72K	Multi-view Image	✗	Lane Centerline, Lane Segment	3D	✓	✓	Link

Prediction and Planning Datasets

Subtask	Input	Output	Evaluation	Dataset
Motion Prediction	Surrounding Traffic States	Spatiotemporal Trajectories of Single/Multiple Vehicle(s)	Displacement Error	Argoverse

				nuScenes

				Waymo

				Interaction

				MONA
Trajectory Planning	Motion States for Ego Vehicles, Scenario Cognition and Prediction	Trajectories for Ego Vehicles	Displacement Error, Safety, Compliance, Comfort	nuPlan

				CARLA

				MetaDrive

				Apollo
Path Planning	Maps for Road Network	Routes Connecting to Nodes and Links	Efficiency, Energy Conservation	OpenStreetMap

				Transportation Networks

				DTAlite

				PeMS

				New York City Taxi Data
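
The table above lists "Displacement Error" as an evaluation criterion without saying which variant is meant. As a minimal sketch, the snippet below computes two commonly used variants for a single trajectory: average displacement error (ADE) and final displacement error (FDE). The function name `displacement_errors`, the 2D point layout, and the sample values are assumptions made for illustration; each benchmark listed above defines its own official metrics.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) position; the 2D layout is an assumption


def displacement_errors(pred: List[Point], gt: List[Point]) -> Tuple[float, float]:
    """Return (ADE, FDE) for one predicted trajectory against its ground truth."""
    if not pred or len(pred) != len(gt):
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between prediction and ground truth at each timestep
    dists = [math.hypot(px - gx, py - gy) for (px, py), (gx, gy) in zip(pred, gt)]
    ade = sum(dists) / len(dists)  # average over all timesteps
    fde = dists[-1]                # error at the final timestep only
    return ade, fde


# Hypothetical three-step trajectories, for illustration only
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
gt = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, gt)
print(ade, fde)  # prints 1.0 2.0
```

Multi-modal benchmarks often report the minimum of these errors over several predicted trajectories; that extension is left out here to keep the sketch short.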

Below we share the latest updates from our team on the DriveData side. Details of DriveEngine and DriveAGI will be released in the future.

DriveLM

Introducing the first benchmark on language prompts for driving.

Quick facts:

Task: given the language prompts as input, predict the trajectory in the scene
Origin dataset: nuScenes
Repo: https://github.com/OpenDriveLab/DriveLM

OpenScene

The largest 3D occupancy forecasting dataset to date for visual pre-training.

Quick facts:

Task: given the large amount of data, predict the 3D occupancy in the environment.
Origin dataset: nuPlan
Repo: https://github.com/OpenDriveLab/OpenScene
Related work: OccNet, 3D Occupancy Prediction Challenge 2023
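
To give a rough idea of what "3D occupancy" means, here is a minimal sketch that turns 3D points into a set of occupied voxels and compares two such sets with intersection over union (IoU). This README does not describe OpenScene's data format or its official metrics, so the helper names `voxelize` and `occupancy_iou`, the voxel size, and the sample points are all assumptions for illustration.

```python
import math
from typing import Iterable, Set, Tuple

Voxel = Tuple[int, int, int]  # integer index of a voxel along x, y, z


def voxelize(points: Iterable[Tuple[float, float, float]], voxel_size: float) -> Set[Voxel]:
    """Map 3D points to the set of voxel indices they fall into."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    return {
        (math.floor(x / voxel_size), math.floor(y / voxel_size), math.floor(z / voxel_size))
        for x, y, z in points
    }


def occupancy_iou(pred: Set[Voxel], gt: Set[Voxel]) -> float:
    """Intersection over union of two sets of occupied voxels."""
    union = pred | gt
    # Two empty grids are treated as a perfect match by convention here
    return len(pred & gt) / len(union) if union else 1.0


# Hypothetical points and prediction, for illustration only
gt_voxels = voxelize([(0.1, 0.2, 0.3), (0.4, 0.1, 0.2), (1.2, 0.0, 0.0)], voxel_size=0.5)
pred_voxels = {(0, 0, 0), (1, 0, 0)}
print(sorted(gt_voxels))                      # prints [(0, 0, 0), (2, 0, 0)]
print(occupancy_iou(pred_voxels, gt_voxels))  # one shared voxel out of three in the union
```

A sparse set of voxel indices keeps the sketch dependency-free; a dense array would be the more usual choice when working with a fixed grid.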

OpenLane-V2 Update

Enriching OpenLane-V2 with standard-definition (SD) maps and scene elements.

Quick facts:

Task: given an SD map (also known as an ADAS map) and scene elements as input, build the driving scene on the fly without the aid of an HD map.
Repo: https://github.com/OpenDriveLab/OpenLane-V2
Related work: TopoNet, Lane Topology Challenge 2023
