DeepScenario: An Open Driving Scenario Dataset for Autonomous Driving System Testing

Dataset availability - The dataset is available in this repository (see deepscenario-dataset) and on Zenodo, DOI: https://doi.org/10.5281/zenodo.7714194.

Paper availability - The dataset paper was accepted at the 20th International Conference on Mining Software Repositories (MSR 2023) and is publicly available.

Citation of the dataset - Chengjie Lu, Tao Yue, Shaukat Ali, "DeepScenario: An Open Driving Scenario Dataset for Autonomous Driving System Testing", 2023 IEEE/ACM 20th International Conference on Mining Software Repositories (MSR), pp.52-56, 2023.

This repository contains:

deepscenario-dataset - DeepScenario dataset, which includes driving scenarios generated by executing three scenario generation strategies: Reinforcement Learning (RL)-based Strategy, Random-based Strategy, Greedy-based Strategy;
deepscenario-toolset - The toolset for DeepScenario dataset, including ScenarioCollector that can automatically collect driving scenarios, and ScenarioRunner that can support replaying driving scenarios. We also provide source code and usage examples for the toolset.

Abstract
DeepScenario Dataset
- Dataset Structure
- Fields in Scenario Attribute Files
DeepScenario Toolset
- Installation
- Usage
Related Efforts
People
Maintainers

Abstract

With the rapid development of autonomous driving systems (ADSs), testing ADSs under various driving conditions has become a key method to ensure the successful deployment of ADSs in the real world. However, it is impossible to test all scenarios due to the inherent complexity and uncertainty of ADSs and the driving tasks. Further, testing ADSs is expensive in terms of time and computational resources. Therefore, a large-scale driving scenario dataset covering various driving conditions is needed. To this end, we present DeepScenario, an open driving scenario dataset containing over 30K executable driving scenarios, which were collected from 2880 test executions of three driving scenario generation strategies. Each scenario in the dataset is labeled with six attributes characterizing test results. We further show the attribute statistics and distribution of driving scenarios. For example, there are 1050 collision scenarios: 917 involve collisions with other vehicles, 105 with pedestrians, and 28 with static obstacles.

The scenario dataset generation process is shown in the following figure. As the figure shows, to generate driving scenarios, several Test Setups need to be specified. Then we execute an Environment Configuration Framework, which generates critical driving scenarios and tests an ADS by configuring its operating environment. After the test executions, the test results are used for Dataset Creation. Specifically, each strategy was set up with three reward functions: reward-ttc, reward-dto, and reward-jerk. We introduced real-world weather data from four different days (i.e., rain-day, rain-night, sunny-day, and sunny-night) to simulate various weather conditions and their changes over time. The test executions were conducted on four roads (i.e., road1, road2, road3, and road4).

DeepScenario Dataset

The DeepScenario dataset contains 33530 driving scenarios generated by executing three strategies: (i) rl_based-strategy, (ii) random-strategy, and (iii) greedy-strategy. Of these, 6703 were generated by rl_based-strategy, fewer than the 13565 generated by random-strategy and the 13262 generated by greedy-strategy.
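As a quick sanity check on the figures quoted in this README, the sketch below recomputes the totals and each strategy's share of the dataset. The variable names and dictionary keys are illustrative only and are not part of the dataset or the toolset.

```python
# Scenario counts per strategy, as quoted in this README.
scenario_counts = {
    "rl_based-strategy": 6703,
    "random-strategy": 13565,
    "greedy-strategy": 13262,
}

# Collision counts by obstacle type, as quoted in the abstract.
# The keys are illustrative labels, not values taken from the dataset.
collision_counts = {"vehicle": 917, "pedestrian": 105, "static obstacle": 28}

total_scenarios = sum(scenario_counts.values())
total_collisions = sum(collision_counts.values())

# Share of the dataset contributed by each strategy.
for name, count in scenario_counts.items():
    print(f"{name}: {count} ({count / total_scenarios:.1%})")

# Fraction of all scenarios that ended in a collision.
print(f"collision scenarios: {total_collisions} "
      f"({total_collisions / total_scenarios:.2%})")
```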

Dataset Structure

The scenario dataset contains three directories, one per strategy. Each strategy directory is organized by the three reward function settings. The scenario file naming convention and directory structure are shown below.

.
├── greedy-strategy
│   ├── reward-dto
│   │   ├── road1-rain_day-scenario-attributes.csv
│   │   ├── road1-rain_day-scenarios
│   │   │   ├── 0_scenario_0.deepscenario
│   │   │   ├── 0_scenario_1.deepscenario
│   │   │   ├── 0_scenario_2.deepscenario
│   │   │   ├── ...
│   │   │   ├── 6_scenario_0.deepscenario
│   │   │   ├── 6_scenario_1.deepscenario
│   │   │   ├── 6_scenario_2.deepscenario
│   │   │   ├── ...
│   │   │   ├── 20_scenario_0.deepscenario
│   │   │   ├── 20_scenario_1.deepscenario
│   │   │   ├── 20_scenario_2.deepscenario
│   │   │   ├── ...
│   │   ├── road1-rain_night-scenario-attributes.csv
│   │   ├── road1-rain_night-scenarios
│   │   │   ├── 0_scenario_0.deepscenario
│   │   │   ├── 0_scenario_1.deepscenario
│   │   │   ├── 0_scenario_2.deepscenario
│   │   │   ├── ...
│   │   ├── road1-sunny_day-scenario-attributes.csv
│   │   ├── road1-sunny_day-scenarios
│   │   │   ├── ...
│   │   ├── road1-sunny_night-scenario-attributes.csv
│   │   ├── road1-sunny_night-scenarios
│   │   │   ├── ...
│   │   ├── ...
│   │   ├── road4-rain_day-scenario-attributes.csv
│   │   ├── road4-rain_day-scenarios
│   │   │   ├── ...
│   │   ├── road4-rain_night-scenario-attributes.csv
│   │   ├── road4-rain_night-scenarios
│   │   │   ├── ...
│   │   ├── road4-sunny_day-scenario-attributes.csv
│   │   ├── road4-sunny_day-scenarios
│   │   │   ├── ...
│   │   ├── road4-sunny_night-scenario-attributes.csv
│   │   └── road4-sunny_night-scenarios
│   │   │   ├── ...
│   ├── reward-jerk
│   │   └── ...
│   └── reward-ttc
│   │   └── ...
├── random-strategy
│   ├── reward-dto
│   │   └── ...
│   ├── reward-jerk
│   │   └── ...
│   └── reward-ttc
│   │   └── ...
├── rl_based-strategy
│   ├── ...
└── dataset-directory-structure.md
157 directories, 33662 files

For example, scenario directory ./greedy-strategy/reward-dto/road1-rain_day-scenarios contains scenarios generated by executing greedy-strategy with test setup reward-dto, road1, rain_day. The attributes of each scenario are listed in ./greedy-strategy/reward-dto/road1-rain_day-scenario-attributes.csv.
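The naming convention above can be sketched in Python as follows. This is a minimal illustration, not part of the toolset: the helper `setup_paths` and the constant names are hypothetical, the root path is whatever location the dataset was placed in, and the sketch assumes that every strategy, reward, road, and weather combination is present on disk.

```python
from itertools import product
from pathlib import PurePosixPath

# Names taken from the directory listing above.
STRATEGIES = ("greedy-strategy", "random-strategy", "rl_based-strategy")
REWARDS = ("reward-dto", "reward-jerk", "reward-ttc")
ROADS = ("road1", "road2", "road3", "road4")
WEATHERS = ("rain_day", "rain_night", "sunny_day", "sunny_night")


def setup_paths(root, strategy, reward, road, weather):
    """Return (scenario directory, attribute CSV) for one test setup.

    Hypothetical helper: it only assembles path strings and does not
    check that the paths exist.
    """
    base = PurePosixPath(root) / strategy / reward
    stem = f"{road}-{weather}"
    return base / f"{stem}-scenarios", base / f"{stem}-scenario-attributes.csv"


# Every combination of strategy, reward function, road, and weather.
all_setups = list(product(STRATEGIES, REWARDS, ROADS, WEATHERS))

scenario_dir, attributes_csv = setup_paths(
    "deepscenario-dataset", "greedy-strategy", "reward-dto", "road1", "rain_day"
)
print(len(all_setups))
print(scenario_dir)
print(attributes_csv)
```

Keeping the path logic in one helper makes it straightforward to loop over `all_setups` and process each attribute file in turn.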

Fields in Scenario Attribute Files

We use ./greedy-strategy/reward-dto/road1-rain_day-scenario-attributes.csv as an example of a scenario attribute file. The fields in a scenario attribute file and their descriptions are shown below.

Execution	ScenarioID	Configuration_API_Description	Attribute[TTC]	Attribute[DTO]	Attribute[Jerk]	Attribute[COL]	Attribute[COLT]	Attribute[SAC]
0	0_scenario_0	A red BoxTruck is overtaking (near) the ego vehicle and maintaining lane.	100000	24.810964	3.4799999999999995	False	None	0
...	...	...	...	...	...	...	...	...
18	18_scenario_4_pedestrian	A skyblue SUV is crossing the road (far) and maintaining lane.	0.0	1.331153	5.299999999999999	True	pedestrian	2.3469087361531504
...	...	...	...	...	...	...	...	...

Execution - This field indicates the ID of the test execution;

ScenarioID - This field indicates the ID of the scenario in the scenario dataset. For example, 0_scenario_0 is the 0th scenario collected during execution 0. The scenario file can be located by its ScenarioID; in this example it is ./greedy-strategy/reward-dto/road1-rain_day-scenarios/0_scenario_0.deepscenario. 18_scenario_4_pedestrian is an example of the ScenarioID of a collision scenario; the suffix indicates that the autonomous vehicle collided with a pedestrian. The corresponding scenario file is located at ./greedy-strategy/reward-dto/road1-rain_day-scenarios/18_scenario_4_pedestrian.deepscenario.

Configuration_API_Description - This field is a brief description of the configuration REST API used to generate the scenario;

Attribute[TTC] - This field is the Time-To-Collision (TTC) attribute, which is a safety measure indicating the time it would take for a collision to occur; smaller TTC values indicate higher safety risk;

Attribute[DTO] - This field is the Distance-To-Obstacles (DTO) attribute, which is a safety measure indicating the distance to obstacles while the autonomous vehicle is driving in the scenario; smaller DTO values indicate higher safety risk;

Attribute[Jerk] - This field is Jerk attribute, which is a measure of comfort for passengers, and larger Jerk values indicate less comfort;

Attribute[COL] - This field is Collision (COL), which is a Boolean attribute indicating if the autonomous vehicle collided with obstacles when driving in the scenario;

Attribute[COLT] - This field is Collision Type (COLT), which is an enumerated attribute that shows the type of obstacle the autonomous vehicle collided with;

Attribute[SAC] - This field is Speed-At-Collision (SAC), which is an attribute that records the speed at which the autonomous vehicle in the scenario collided (if it happened) with the obstacle.
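The fields described above can be read with Python's standard `csv` module, as in the following sketch. It is an illustration under stated assumptions rather than part of the toolset: the attribute files are assumed to be comma-delimited with the header shown in the example table, the two sample rows are copied from that table, and the `SCENARIO_ID` pattern and `parse_row` helper are hypothetical, inferred from the two example IDs only.

```python
import csv
import io
import re

# Two rows copied from the example table; the comma delimiter is an assumption.
SAMPLE_CSV = """Execution,ScenarioID,Configuration_API_Description,Attribute[TTC],Attribute[DTO],Attribute[Jerk],Attribute[COL],Attribute[COLT],Attribute[SAC]
0,0_scenario_0,A red BoxTruck is overtaking (near) the ego vehicle and maintaining lane.,100000,24.810964,3.4799999999999995,False,None,0
18,18_scenario_4_pedestrian,A skyblue SUV is crossing the road (far) and maintaining lane.,0.0,1.331153,5.299999999999999,True,pedestrian,2.3469087361531504
"""

# Hypothetical pattern inferred from the example IDs:
# <execution>_scenario_<index>[_<collision type>]
SCENARIO_ID = re.compile(
    r"^(?P<execution>\d+)_scenario_(?P<index>\d+)(?:_(?P<collision>\w+))?$"
)


def parse_row(row):
    """Convert one raw CSV row (all strings) into typed Python values."""
    match = SCENARIO_ID.match(row["ScenarioID"])
    colt = row["Attribute[COLT]"]
    return {
        "execution": int(row["Execution"]),
        "scenario_id": row["ScenarioID"],
        "scenario_index": int(match["index"]) if match else None,
        "description": row["Configuration_API_Description"],
        "ttc": float(row["Attribute[TTC]"]),
        "dto": float(row["Attribute[DTO]"]),
        "jerk": float(row["Attribute[Jerk]"]),
        "collided": row["Attribute[COL]"] == "True",
        "collision_type": None if colt == "None" else colt,
        "speed_at_collision": float(row["Attribute[SAC]"]),
        # File name inside the matching "<road>-<weather>-scenarios" directory.
        "scenario_file": f"{row['ScenarioID']}.deepscenario",
    }


rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]

# Keep only the scenarios in which the autonomous vehicle collided.
collisions = [r for r in rows if r["collided"]]
for r in collisions:
    print(r["scenario_id"], r["collision_type"], r["speed_at_collision"])
```

To read a real attribute file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle, for example `open(path, newline="")`.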

DeepScenario Toolset

The DeepScenario toolset includes ScenarioCollector and ScenarioRunner, which have been integrated into the PythonAPI developed by SVLSimulator. The toolset has been tested on SVLSimulator Version 2021.1 with Apollo 5.0 running as the autonomous driving system.

Installation

For the requirements and installation procedure for running the toolset, please see toolset-installation.

Usage

To see detailed instructions for using ScenarioCollector and ScenarioRunner, please look at toolset-usage.

Related Efforts

DeepCollision is an open-source environment configuration framework, which generates critical driving scenarios and tests an ADS by configuring its operating environment.

More details about the framework are available in the paper: Lu, Chengjie, et al. "Learning Configurations of Operating Environment of Autonomous Vehicles to Maximize their Collisions," in IEEE Transactions on Software Engineering, vol. 49, no. 1, pp. 384-402, 1 Jan. 2023, https://doi.org/10.1109/TSE.2022.3150788.

People

Chengjie Lu https://www.simula.no/people/chengjielu
Tao Yue https://www.simula.no/people/tao
Shaukat Ali https://www.simula.no/people/shaukat

Maintainers

@ChengjieLu
