Keras RetinaNet

Keras implementation of RetinaNet object detection, as described in the paper Focal Loss for Dense Object Detection by Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He and Piotr Dollár.

Training

An example of how to train keras-retinanet can be found here.

Usage

For training on Pascal VOC, run:

python examples/train_pascal.py <path to VOCdevkit/VOC2007>

For training on MS COCO, run:

python examples/train_coco.py <path to MS COCO>

In general, the steps to train on your own datasets are:

Create a model by calling keras_retinanet.models.ResNet50RetinaNet and compile it. Empirically, the following compile arguments have been found to work well:

import keras
import keras_retinanet.losses

# `model` is the network created in the previous step.
model.compile(
    loss={
        'regression'    : keras_retinanet.losses.regression_loss,
        'classification': keras_retinanet.losses.focal_loss()
    },
    optimizer=keras.optimizers.adam(lr=1e-5, clipnorm=0.001)
)

Create generators for training and testing data (an example is shown in keras_retinanet.preprocessing.PascalVocIterator). These generators should yield an image batch (shaped (batch_size, height, width, channels)) and a target batch (shaped (batch_size, num_anchors, 5)). A current limitation is that batch_size must equal 1.
Use model.fit_generator to start training.
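
To make the expected batch shapes concrete, here is a minimal sketch of a generator that yields random data in the shapes described above. The function name `dummy_batch_generator` and its parameters are hypothetical and not part of keras-retinanet, and the interpretation of the 5 target values as four box coordinates plus a label is an assumption. A real generator (such as PascalVocIterator) would load images and compute targets from annotations.

```python
import numpy as np


def dummy_batch_generator(num_anchors=10, num_classes=3, height=600, width=1000, seed=0):
    """Yield (image_batch, target_batch) pairs of random data, forever.

    Hypothetical helper for illustration only; it mirrors the shapes
    described in the steps above, not the library's real preprocessing.
    """
    rng = np.random.RandomState(seed)
    batch_size = 1  # the limitation noted above: batch_size must equal 1
    while True:
        # Image batch: (batch_size, height, width, channels)
        images = rng.uniform(0, 255, size=(batch_size, height, width, 3)).astype(np.float32)
        # Target batch: (batch_size, num_anchors, 5)
        # ASSUMPTION: the 5 values are (x1, y1, x2, y2, label).
        boxes = rng.uniform(0, 1, size=(batch_size, num_anchors, 4)).astype(np.float32)
        labels = rng.randint(0, num_classes, size=(batch_size, num_anchors, 1)).astype(np.float32)
        targets = np.concatenate([boxes, labels], axis=2)
        yield images, targets


gen = dummy_batch_generator(num_anchors=7, height=64, width=96)
images, targets = next(gen)
print(images.shape, targets.shape)
```

A generator of this form is what model.fit_generator expects as its first argument, together with values for steps_per_epoch and epochs.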

Testing

An example of testing the network can be seen in this Notebook. In general, output can be retrieved from the network as follows:

_, _, detections = model.predict_on_batch(inputs)

Here, detections holds the resulting detections, shaped (None, None, 4 + num_classes), where each detection is laid out as (x1, y1, x2, y2, bg, cls1, cls2, ...).
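
As an illustration of how such an array might be post-processed, the sketch below splits it into boxes, labels and scores using NumPy. The helper name `decode_detections` and the `score_threshold` parameter are hypothetical and not part of keras-retinanet. It assumes the layout given above, with the background score at class index 0 and a batch size of 1.

```python
import numpy as np


def decode_detections(detections, score_threshold=0.5):
    """Return (boxes, labels, scores) for the single image in the batch.

    ASSUMPTION: `detections` has shape (1, num_detections, 4 + num_classes)
    with columns (x1, y1, x2, y2, bg, cls1, cls2, ...).
    """
    dets = detections[0]                 # drop the batch dimension
    boxes = dets[:, :4]                  # (x1, y1, x2, y2)
    class_scores = dets[:, 4:]           # bg, cls1, cls2, ...
    labels = np.argmax(class_scores, axis=1)
    scores = class_scores[np.arange(len(dets)), labels]
    # Keep confident detections whose best class is not background.
    keep = (labels != 0) & (scores >= score_threshold)
    return boxes[keep], labels[keep], scores[keep]


# Synthetic input with two foreground classes plus background.
fake = np.array([[
    [10, 20, 50, 60, 0.1, 0.8, 0.1],
    [0, 0, 5, 5, 0.9, 0.05, 0.05],
    [30, 30, 90, 90, 0.2, 0.3, 0.5],
]], dtype=np.float32)
boxes, labels, scores = decode_detections(fake)
```

With real output, `fake` would be replaced by the `detections` array returned from model.predict_on_batch.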

Execution time on an NVIDIA Titan X (Pascal) is roughly 55 ms for an image of shape 1000x600x3.

Results

MS COCO

The MS COCO model can be downloaded here. Results using the cocoapi are shown below (note: according to the paper, this configuration should achieve an mAP of 0.34).

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.306
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.485
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.323
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.131
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.336
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.439
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.274
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.410
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.422
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.217
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.467
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.591
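
The IoU values in the table refer to the intersection over union between a predicted box and a ground-truth box; IoU=0.50:0.95 means the metric is averaged over thresholds from 0.50 to 0.95 in steps of 0.05. As a rough illustration, IoU for two (x1, y1, x2, y2) boxes could be computed as follows. This is a minimal sketch with a hypothetical function name, not the cocoapi implementation, and it treats coordinates as continuous values.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes offset by 5 horizontally: overlap 50, union 150.
overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))
```

Under this definition, a detection with an overlap of one third would not count as a match at any of the thresholds used in the table.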

Status

The examples show how to train keras-retinanet on Pascal VOC and MS COCO. Example output images are shown below.

Todos

Allow batch_size > 1.
Compare results with those reported in the paper.
Configure CI.

Notes

This repository is tested on Keras version 2.0.8, but should also work on 2.0.7.
This repository is tested using OpenCV 3.3 (3.0+ should be supported).

Contributions to this project are welcome.

Discussions

Feel free to join the #keras-retinanet channel on the Keras Slack for discussions and questions.
