BMW-InnovationLab / SORDI-Data-Pipeline-Reader

The SORDI dataset ships one JSON annotation file per frame. The following tools create a COCO-style annotation out of these, so the SORDI data can easily be fed into COCO-style training pipelines.

Loading BMW SORDI into NVIDIA DALI Pipeline (COCO based)

Sample code for consuming the COCO-style SORDI annotations with an NVIDIA DALI pipeline is included and can be used with PyTorch, TensorFlow, etc.

Prerequisites

  • NVIDIA Docker 2
  • Docker CE latest stable release

Unzipping SORDI dataset

Open a terminal in the SORDI folder and execute the following command to unzip all archives.

for i in *.zip; do unzip "$i"; done

Delete the zip files using the following command.

rm *.zip

This is how the directory structure should look after unzipping.

ls -l SORDI

Building and Running the docker image

The base image comes from the NGC cloud (ngc.nvidia.com).

Please register there if you have not done so already (takes about 3 minutes).

Before you start, map the SORDI directory into the Docker container:

open 1_run.sh

Change /home/me/SORDI to the path of the extracted SORDI dataset folder.

1 - Log in to the NVIDIA NGC registry using the following command:

docker login nvcr.io

2 - Log in to Docker using the following command:

docker login

3 - Build and run the image using the following command:

source 1_run.sh

When done, you should end up inside the running container.

Run 2_traverse_unzipped_SORDI.ipynb

This notebook walks through the unzipped SORDI files, opens an SQLite database, and creates an entry in the FRAMES table for each frame and annotation.
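The traversal result can be sketched roughly as follows. This is not the notebook's actual code; the FRAMES column names and the SORDI per-frame JSON field names (`ObjectClassName`, `Left`, `Top`, `Right`, `Bottom`) are assumptions for illustration:

```python
import sqlite3

# Hypothetical sketch: one FRAMES row per (frame, annotation) pair.
# Column names and JSON field names are assumed, not verified.
con = sqlite3.connect("SORDI.sqlite")
con.execute("""CREATE TABLE IF NOT EXISTS FRAMES (
    frame TEXT, class TEXT,
    x_min INTEGER, y_min INTEGER, x_max INTEGER, y_max INTEGER)""")

frame = "images/0001.png"
annotations = [  # example content of one per-frame JSON file
    {"ObjectClassName": "klt_box", "Left": 10, "Top": 20, "Right": 110, "Bottom": 140},
]
con.executemany(
    "INSERT INTO FRAMES VALUES (?, ?, ?, ?, ?, ?)",
    [(frame, a["ObjectClassName"], a["Left"], a["Top"], a["Right"], a["Bottom"])
     for a in annotations])
con.commit()
```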

Inside the terminal run:

jupyter notebook

Open the provided URL in a browser and run 2_traverse_unzipped_SORDI.ipynb.

You will find the created SQLite database in the workspace folder. Check its entries via:

sqlite3 SORDI.sqlite
.tables
select * from FRAMES limit 10;

Feel free to derive additional table entries, such as:

  • Number of objects in the frame
  • Overlap/pixel overlap of objects in the frame
  • Uncertainty estimation
  • Single class or multiclass
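Entries like the object count and the single-/multiclass flag can be derived directly from the existing rows. A minimal sketch, assuming a FRAMES table with `frame` and `class` columns (sample data below is made up):

```python
import sqlite3

# In-memory stand-in for the assumed FRAMES schema, with made-up rows
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE FRAMES (frame TEXT, class TEXT)")
con.executemany("INSERT INTO FRAMES VALUES (?, ?)", [
    ("0001.png", "klt_box"), ("0001.png", "pallet"), ("0002.png", "klt_box")])

# Per frame: object count and whether more than one class occurs
stats = {frame: (n_objects, n_classes > 1)
         for frame, n_objects, n_classes in con.execute(
             "SELECT frame, COUNT(*), COUNT(DISTINCT class) FROM FRAMES GROUP BY frame")}
```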

Run 3_create_coco_annotation.ipynb

Run the notebook to create the COCO annotation file.

The outcome is the file:

sordi.coco
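The export can be sketched roughly like this. It is not the notebook's actual code; the FRAMES schema, the x/y/width/height bbox convention, and the ID assignment are assumptions for illustration:

```python
import json
import sqlite3

# Stand-in database with an assumed FRAMES schema (bbox stored as x, y, w, h)
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE FRAMES (frame TEXT, class TEXT, x INT, y INT, w INT, h INT)")
con.execute("INSERT INTO FRAMES VALUES ('images/0001.png', 'klt_box', 10, 20, 100, 120)")

image_ids, cat_ids, annotations = {}, {}, []
for frame, cls, x, y, w, h in con.execute("SELECT * FROM FRAMES"):
    img_id = image_ids.setdefault(frame, len(image_ids) + 1)
    cat_id = cat_ids.setdefault(cls, len(cat_ids) + 1)
    annotations.append({"id": len(annotations) + 1, "image_id": img_id,
                        "category_id": cat_id, "bbox": [x, y, w, h],
                        "area": w * h, "iscrowd": 0})

coco = {"images": [{"id": i, "file_name": f} for f, i in image_ids.items()],
        "annotations": annotations,
        "categories": [{"id": i, "name": n} for n, i in cat_ids.items()]}
with open("sordi.coco", "w") as f:
    json.dump(coco, f)
```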

This is a great place to filter the training dataset in a smart manner, e.g. selecting only multiclass training frames with a certain object overlap. For now, this notebook does not filter at all but exports all data found in the database.
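A hypothetical filter of that kind, keeping only multiclass frames from a COCO-style dict (the sample data below is made up):

```python
# Made-up COCO-style sample: image 1 has two classes, image 2 only one
coco = {
    "images": [{"id": 1, "file_name": "0001.png"}, {"id": 2, "file_name": "0002.png"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 20, 100, 120]},
        {"id": 2, "image_id": 1, "category_id": 2, "bbox": [30, 40, 50, 60]},
        {"id": 3, "image_id": 2, "category_id": 1, "bbox": [5, 5, 20, 20]},
    ],
    "categories": [{"id": 1, "name": "klt_box"}, {"id": 2, "name": "pallet"}],
}

# Collect the set of category ids per image, then keep multiclass images only
cats_per_image = {}
for ann in coco["annotations"]:
    cats_per_image.setdefault(ann["image_id"], set()).add(ann["category_id"])
keep = {img_id for img_id, cats in cats_per_image.items() if len(cats) > 1}

filtered = dict(coco,
                images=[im for im in coco["images"] if im["id"] in keep],
                annotations=[a for a in coco["annotations"] if a["image_id"] in keep])
```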

Run 4_run_DALI_coco_pipeline.ipynb

Ready to run the pipeline? Let's go. NVIDIA DALI performs image decompression and augmentations on the GPU. Since the annotation file can get large, the initial loading and parsing take a moment.

Acknowledgments

  • Adolf Hohl
  • Ziad Saoud, BMW Group TechOffice MUNICH
  • Chafic Abou Akar, BMW Group TechOffice MUNICH

License: Apache License 2.0

