MobileNet-SSD and MobileNetV2-SSD/SSDLite with PyTorch

Object Detection with MobileNet-SSD, MobileNetV2-SSD/SSDLite on VOC, BDD100K Datasets.

Results

Detection

View the result on Youtube

Dependencies

Python 3.6+
OpenCV
PyTorch
Pyenv (optional)
tensorboard
tqdm

Dataset Path (optional)

The dataset path should be structured as follow:

$ pip install --user kaggle
$ kaggle datasets download solesensei/solesensei_bdd100k

# Unzip downloaded zip file
# follow instructions to conduct the directory structure as below.
|- bdd100k -- bdd100k -- images -- 100k -- train -- (70000 images)
|               |                        |- val -- (10000 images)
|               |
|               |- labels -- (.json)
|               |
|               |- xml -- train -- (.xml)
|               |- val -- (.xml)
|
|- pytorch-ssd - data -- VOCdevkit -- test -- VOC2007 -- (Annotations, ImageSets, JPEGImages,...)
     (our repo)   |                    |- VOC2007 -- (Annotations, ImageSets, JPEGImages,...)
                  |
                  |- bdd_files
                  |- images
                  |- models
                  |- ...
                  |- train_ssd_BDD.py
                  |- ssd_test_img.py
                  |- ...
@TranLeAnh

$ mkdir bdd100k
$ mv solesensei_bdd100k.zip bdd100k
$ cd bdd100k
$ unzip solesensei_bdd100k.zip

# under 100k/train, some images are splitted into testA, testB, trainA, trainB.
# extract them out.
$ mv bdd100k/bdd100k/image bdd100k/

$ mv bdd100k/images/100k/train/testA/* bdd100k/images/100k/train/
$ mv bdd100k/images/100k/train/testB/* bdd100k/images/100k/train/
$ mv bdd100k/images/100k/train/trainA/* bdd100k/images/100k/train/
$ mv bdd100k/images/100k/train/trainB/* bdd100k/images/100k/train/

# bdd100k/bdd100k/labels is bdd100k/bdd100-_labels_release/bdd100k/labels
# Change location of it to satisfy above directory structure.
$ cp -r bdd100k_labels_release/bdd100k/labels/ ./bdd100k/

Pre-processing

Convert BDD100K anotation format (.json) to VOC anotation format (.xml)

$ python bdd2voc.py

Remove training samples having no anotation (70000 to 69863)

$ python remove_nolabel_data.py

Download Pre-trained Models (VOC)

MobileNet-SSD

$ wget -P models https://storage.googleapis.com/models-hao/mobilenet-v1-ssd-mp-0_675.pth

MobileNetV2-SSDLite

$ wget -P models https://storage.googleapis.com/models-hao/mb2-ssd-lite-mp-0_686.pth

Train

Train MobileNet-SSD (VOC)

$ python train_ssd_VOC.py --datasets ~/data/VOCdevkit/VOC2007/ --validation_dataset ~/data/VOCdevkit/test/VOC2007/ --net mb1-ssd --batch_size 24 --num_epochs 100 --scheduler cosine --lr 0.01 --t_max 200

Train MobileNet-SSD (BDD100K)

$ python train_ssd_BDD.py --datasets ../bdd100k/bdd100k/images/100k/train/ --validation_dataset ../bdd100k/bdd100k/images/100k/val/ --net mb1-ssd --batch_size 48 --num_epochs 200 --scheduler cosine --lr 0.01 --t_max 200

Resume training a trained model (BDD100K)

$ python train_ssd_BDD.py --datasets ../bdd100k/bdd100k/images/100k/train/ --validation_dataset ../bdd100k/bdd100k/images/100k/val/ --net mb1-ssd --batch_size 48 --num_epochs 200 --scheduler cosine --lr 0.01 --t_max 200 --resume models/mb1-ssd-Epoch-105-Loss-inf.pth

Train pretrained-model MobileNetV2-SSDLite (BDD100K)

$ python train_ssd_BDD.py --datasets ../bdd100k/bdd100k/images/100k/train/ --validation_dataset ../bdd100k/bdd100k/images/100k/val/ --net mb2-ssd-lite --pretrained_ssd models/mb2-ssd-lite-net.pth --scheduler cosine --lr 0.01 --t_max 100 --batch_size 36 --num_epochs 200

Resume Training

$ python train_ssd_BDD.py --datasets ../bdd100k/bdd100k/images/100k/train/ --validation_dataset ../bdd100k/bdd100k/images/100k/val/ --net mb2-ssd-lite --resume models/(trained-model-name).pth --batch_size 16 --num_epochs 100 --scheduler cosine --lr 0.001 --t_max 100 --debug_steps 10

Test

Test on image

$ python ssd_test_img.py

Test on video

$ python ssd_test_video.py

Produce Detection Information

Print out text file (.txt) of dectection information

notebook: print_text_files.ipynb

File format: {image_name}.txt [ class_name confidence x1 y1 x2 y2 ]

car 0.980528 194 356 466 513
car 0.897605 752 372 975 467
car 0.414176 580 372 646 416
traffic_light 0.605162 844 178 867 225
traffic_light 0.602555 816 176 841 224
traffic_light 0.495851 109 62 141 124
truck 0.580303 167 265 485 483

References

April 2020

Tran Le Anh

tranleanh / mobilenets-ssd-pytorch

MobileNet-SSD and MobileNetV2-SSD/SSDLite with PyTorch

Results

Dependencies

Dataset Path (optional)

Pre-processing

Download Pre-trained Models (VOC)

Train

Resume Training

Test

Produce Detection Information

References

About

Languages