Kamil Kaczmarczyk September 20th, 2017
This repository contains a capstone project for the Machine Learning NanoDegree from Udacity.
The goal of the work is to detect and localize aircraft and birds in images or video frames taken from an aircraft's point of view in flight, so that potential airborne hazards and obstacles can be avoided.
The full detailed report is provided in capstone_report.md, which is also available in .pdf format.
The code is written in Python using the TensorFlow library for machine learning.
It is divided into three chapters:
- Chapter 1: Data pre-processing - Capstone Part 01 - Dataset Preparation & Exploration.ipynb
- Chapter 2: Approach 1 implementation based on Support Vector Machines and Histogram of Oriented Gradients - Capstone Part 02 - Apply SVM.ipynb
- Chapter 3: Approach 2 implementation based on Transfer Learning from Convolutional Neural Network AlexNet - Capstone Part 03 - Apply CNN with Transfer Learning from AlexNet.ipynb
The transfer learning work in Chapter 3 requires pre-trained model weights, which can be accessed here
The dataset used for this project consists of pictures downloaded from Google image search. It is divided into four distinct classes:
- aircraft pictures of Boeing 737 and Cessna 172 models in flight - 400 images
- bird pictures, also in flight, mostly against a sky background - 367 images
- sky images containing either a clear sky or a clouded and occluded sky - 407 images
- ground images containing a mix of pictures taken in flight of cities, fields, mountains and other landscapes, where most of the image area is covered by ground rather than sky - 407 images
Samples of the dataset images are available here
Model accuracies on the test images are:
- Approach 1: 0.8805 for the Support Vector Machine based on Histogram of Oriented Gradients features
- Approach 2: 0.9916 for the Convolutional Neural Network based on AlexNet and Transfer Learning from the ImageNet dataset
For Approach 2 (CNN Transfer Learning based on AlexNet), sample detections and classifications on test images, together with the final precision and recall for the aircraft and bird classes, are presented below.
Approach 2 - Transfer Learning applied on AlexNet CNN on New Images
Confusion Matrix
| | Aircraft (actual) | Bird (actual) | Non-Aircraft & Non-Bird (actual) |
|---|---|---|---|
| Aircraft (predicted) | 11 | 0 | 0 |
| Bird (predicted) | 0 | 3 | 1 |
| Non-Aircraft & Non-Bird (predicted) | 0 | 3 | X |
Precision & Recall
| | Aircraft | Bird |
|---|---|---|
| Precision | 1 | 0.75 |
| Recall | 1 | 0.5 |
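
The precision and recall figures can be reproduced from the confusion matrix, whose rows are predictions and whose columns are actual classes. The snippet below is a minimal sketch of that arithmetic; the names `CONFUSION`, `precision` and `recall` are illustrative and do not come from the project notebooks. The cell reported as "X" is kept as `None`, since it is not needed for the aircraft or bird metrics.

```python
# Rows are predicted classes, columns are actual classes.
# Order of classes: aircraft, bird, non-aircraft & non-bird ("other").
CLASSES = ["aircraft", "bird", "other"]
CONFUSION = [
    [11, 0, 0],     # predicted aircraft
    [0, 3, 1],      # predicted bird
    [0, 3, None],   # predicted other; the last cell is "X" (not reported)
]


def precision(matrix, k):
    """True positives of class k divided by all predictions of class k (row sum)."""
    return matrix[k][k] / sum(matrix[k])


def recall(matrix, k):
    """True positives of class k divided by all actual members of class k (column sum)."""
    return matrix[k][k] / sum(row[k] for row in matrix)


# Only aircraft and bird are evaluated; the "other" class would need the
# unreported cell.
for k, name in enumerate(CLASSES[:2]):
    print(f"{name}: precision={precision(CONFUSION, k):.2f}, "
          f"recall={recall(CONFUSION, k):.2f}")
```

For the bird class, 3 of the 4 bird predictions are correct (precision 0.75) and 3 of the 6 actual birds are found (recall 0.5).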
Object Detection & Localization
Column 1 | Column 2 | Column 3 | Column 4 | Column 5 |
---|---|---|---|---|
Original test image | Aircraft detection bounding boxes | Aircraft detection heatmap - thresholded | Bird detection bounding boxes | Bird detection heatmap - thresholded |
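
This summary does not describe how the thresholded heatmaps are produced, so the following is only a sketch of one common way to do it, not the project's actual code. It assumes that a classifier has been run over overlapping windows and has returned a list of positive boxes as `(x0, y0, x1, y1)` with exclusive upper bounds. All function names and the `min_votes` parameter are hypothetical.

```python
def build_heatmap(height, width, boxes):
    """Count, for every pixel, how many detection boxes cover it."""
    heat = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        # Clip each box to the image before voting.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                heat[y][x] += 1
    return heat


def threshold(heat, min_votes):
    """Zero out pixels supported by fewer than min_votes detections."""
    return [[v if v >= min_votes else 0 for v in row] for row in heat]


def bounding_box(heat):
    """Smallest (x0, y0, x1, y1) box enclosing all non-zero pixels, or None."""
    ys = [y for y, row in enumerate(heat) if any(row)]
    xs = [x for row in heat for x, v in enumerate(row) if v]
    if not ys:
        return None
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


# Three overlapping detections on one object plus one isolated false positive.
detections = [(2, 2, 6, 6), (3, 3, 7, 7), (4, 4, 8, 8), (10, 0, 12, 2)]
heat = build_heatmap(height=8, width=12, boxes=detections)
hot = threshold(heat, min_votes=2)
print(bounding_box(hot))
```

With `min_votes=2` the isolated detection is discarded and the remaining region is enclosed by the box `(3, 3, 7, 7)`. This sketch returns a single enclosing box; separating several objects in one image would additionally need connected-component labelling of the thresholded heatmap.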