TO BE UPDATED SOON!

The current codebase will be updated soon to reflect the results found in the following paper:

Deep object detection for waterbird monitoring using aerial imagery
Krish Kabra^*,1, Alexander Xiong^*,1, Wenbin Li^*,1, Minxuan Luo¹, William Lu¹, Raul Garcia¹, Dhananjay Vijay¹, Jiahui Yu¹, Maojie Tang¹, Tianjiao Yu¹, Hank Arnold², Anna Vallery², Richard Gibbons³, Arko Barman¹
^* equal contribution
¹Rice University, Houston, TX 77005, USA
²Houston Audubon Society, Houston, TX 77079, USA
³American Bird Conservancy, The Plains, VA 20198, USA

Stay tuned!

Team Audubon

Development of Machine Learning Algorithms for Precision Waterbird Monitoring

Table of Contents

➤ About The Project
➤ Prerequisites
➤ Folder Structure
➤ Installation & Usage Instructions
➤ Dataset
➤ Preprocessing
- Tiling
- Data Augmentation
➤ Results and Discussion
➤ References
➤ Contributors

About The Project

To improve both the accuracy and the speed of bird counts, Houston Audubon and students from the D2K capstone course at Rice University are developing machine learning and computer vision algorithms that detect birds in images captured by unmanned aerial vehicles (UAVs). The specific goals are to:

Count and survey the number of birds.
Identify different species of detected birds.

Prerequisites

The following open source packages are used in this project:

Numpy
Pandas
Matplotlib
OpenCV
Detectron2
WAndB

Folder Structure

code
.
├── configs
├────── (useful sweep config files for WAndB)
├── scripts
├────── data_exploration.py
├── utils
├────── config.py
├────── cropping.py
├────── dataloader.py
├────── evaluation.py
├────── plotting.py
├────── trainer.py
├── README.md
├── requirements.txt
├── data_exploration.py  
├── Audubon-Bird-Detection-Tutorial.ipynb
├── train_net.py
├── wandb_train_net.py

Installation & Usage Instructions

Clone the repository

git clone https://github.com/RiceD2KLab/Audubon_F21.git

Install PyTorch

Installation instructions here

pip3 install torch==1.10.0+cu102 torchvision==0.11.1+cu102 -f https://download.pytorch.org/whl/cu102/torch_stable.html

Install Detectron2

Installation instructions here

python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
# (add --user if you don't have permission)

# Or, to install it from a local clone:
git clone https://github.com/facebookresearch/detectron2.git
python -m pip install -e detectron2

# On macOS, you may need to prepend the above commands with a few environment variables:
CC=clang CXX=clang++ ARCHFLAGS="-arch x86_64" python -m pip install ...

Install other dependencies

pip install -r requirements.txt

Execute the scripts as required.

See train_net.py, wandb_train_net.py, or the Colab notebook for examples of how to use the code.

Data Science Pipeline

Dataset

Houston Audubon has provided us with a 52 GB dataset of images captured using a DJI M300RTK UAV with a P1 camera attachment. The images are high resolution, typically 8192 × 5460 pixels. The dataset contains 3 GB of annotated images, each with a corresponding CSV file specifying species labels and bounding box locations. The annotated images feature 19,276 birds of 15 species; the remaining 50.5 GB are raw images without annotations. Each CSV file contains:

species id: unique integer id of the species
species label: name of the species in words
x: minimum x coordinate of the bounding box
y: minimum y coordinate of the bounding box
width: width of the bounding box
height: height of the bounding box
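As a minimal sketch, one annotation file could be loaded with pandas as shown here. It assumes the CSV column headers match the field names listed above exactly, which may not hold for the real files. The two sample rows and the helper name `load_annotations` are hypothetical.

```python
import io

import pandas as pd

# Hypothetical two-row annotation file; real files come with the dataset.
SAMPLE_CSV = """species id,species label,x,y,width,height
1,Brown Pelican,100,200,50,40
2,Laughing Gull,4000,3000,30,25
"""


def load_annotations(csv_source):
    """Read one annotation CSV and add x_max / y_max corner columns."""
    df = pd.read_csv(csv_source)
    # The files store the top-left corner plus a size; many detection
    # tools expect both corners instead, so derive them here.
    df["x_max"] = df["x"] + df["width"]
    df["y_max"] = df["y"] + df["height"]
    return df


annotations = load_annotations(io.StringIO(SAMPLE_CSV))
print(annotations[["species label", "x", "y", "x_max", "y_max"]])
```

In practice `csv_source` would be the path to a CSV file from the annotated part of the dataset.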

Preprocessing

The data wrangling module of the pipeline largely involves preparing the data to be fed into deep learning models used to detect objects, namely birds. Our data wrangling process includes:

Tiling
Data Augmentation

Tiling

In general, deep learning models train faster and often perform better on smaller images. For instance, 600 × 600 pixels is a common input size for typical object detection models. Our first step was therefore to split the 8192 × 5460 images into tiles. The tile size can be set through parameters and defaults to 600 × 600.
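To give a feel for the numbers, the following sketch lists the top-left corner of every tile for one full-size image. It is not the code in utils/cropping.py; the function name `tile_origins` is hypothetical, and the sketch assumes non-overlapping tiles where edge tiles may extend past the image border.

```python
def tile_origins(image_width, image_height, tile_size=600):
    """Return the (x, y) top-left corner of every tile, row by row.

    Tiles on the right and bottom edges may extend past the image and
    would need padding or cropping (an assumption of this sketch).
    """
    return [
        (x, y)
        for y in range(0, image_height, tile_size)
        for x in range(0, image_width, tile_size)
    ]


origins = tile_origins(8192, 5460)
print(len(origins))  # 140 tiles: 14 columns x 10 rows
```

With the default tile size, a single 8192 × 5460 image therefore yields 140 tiles under these assumptions.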

A caveat of this approach is that some birds will unavoidably be cut in two and appear in two neighboring tiles, as seen in Figure 2. Because counting birds is one of our objectives, the same problem must also be handled in the detection phase. During tiling, only the tile that contains more than 50% of a split bird keeps its bounding box; the remaining part of the bounding box in the neighboring tile is discarded. This means that we are training the model to detect both complete and partial birds.
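The 50% rule can be sketched as follows. This is an illustration under stated assumptions rather than the project's implementation: boxes are (x, y, width, height) tuples in full-image pixels, and the function name `box_for_tile` and its parameters are hypothetical.

```python
def box_for_tile(box, tile_x, tile_y, tile_size=600, min_fraction=0.5):
    """Clip a box (x, y, width, height) to a tile, or return None.

    The clipped box is kept only if more than `min_fraction` of the
    original box area lies inside the tile. The returned coordinates
    are relative to the tile's top-left corner.
    """
    x, y, w, h = box
    left = max(x, tile_x)
    top = max(y, tile_y)
    right = min(x + w, tile_x + tile_size)
    bottom = min(y + h, tile_y + tile_size)
    if right <= left or bottom <= top:
        return None  # the box does not overlap this tile at all
    fraction = ((right - left) * (bottom - top)) / (w * h)
    if fraction <= min_fraction:
        return None  # the smaller part of a split bird is discarded
    return (left - tile_x, top - tile_y, right - left, bottom - top)


# A bird that straddles the boundary between the tiles at x=0 and x=600.
bird = (580, 100, 50, 40)
print(box_for_tile(bird, 0, 0))    # None: only 40% of the box is here
print(box_for_tile(bird, 600, 0))  # (0, 100, 30, 40): 60% of the box
```

Note that under this rule a bird split exactly in half, or spread over four tiles with no part above 50%, would lose its box entirely; how the project handles those cases is not stated here.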

In the detection stage, if repeated counting turns out to be a common pattern, we also plan to develop a mechanism that merges partial detections in neighboring tiles and counts them as one bird.
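One possible merging mechanism is sketched here purely as an illustration; the project has not committed to any particular method. The sketch shifts tile-relative detections back into full-image coordinates and then greedily merges boxes that overlap or nearly touch. All function names and the `tol` parameter are hypothetical.

```python
def to_global(box, tile_x, tile_y):
    """Shift a tile-relative (x, y, w, h) box into full-image coordinates."""
    x, y, w, h = box
    return (x + tile_x, y + tile_y, w, h)


def touches(a, b, tol=2):
    """True if boxes a and b overlap or lie within `tol` pixels of each other."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return (ax <= bx + bw + tol and bx <= ax + aw + tol
            and ay <= by + bh + tol and by <= ay + ah + tol)


def merge_touching(boxes, tol=2):
    """Greedily merge touching boxes into their common enclosing box."""
    merged = []
    for box in boxes:
        i = 0
        while i < len(merged):
            if touches(box, merged[i], tol):
                other = merged.pop(i)
                x0, y0 = min(box[0], other[0]), min(box[1], other[1])
                x1 = max(box[0] + box[2], other[0] + other[2])
                y1 = max(box[1] + box[3], other[1] + other[3])
                box = (x0, y0, x1 - x0, y1 - y0)
                i = 0  # the grown box may now touch earlier boxes
            else:
                i += 1
        merged.append(box)
    return merged


detections = [
    to_global((580, 100, 20, 40), 0, 0),    # left part of a split bird
    to_global((0, 100, 30, 40), 600, 0),    # right part, in the next tile
    to_global((300, 300, 20, 20), 600, 0),  # a separate bird
]
birds = merge_touching(detections)
print(len(birds))  # 2
```

A limitation of this simple approach is that two distinct birds sitting very close together would also be merged; restricting merges to boxes that meet at a tile boundary would be one refinement.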

Data Augmentation

As a rule of thumb, deep learning models are effective with about 1,000 images per class, but some bird species do not have that many training samples in our dataset. Our team plans to make the models more robust through data augmentation, which means training them on synthetically modified data:

rotation: Orthogonal or non-orthogonal rotations. Rotation is a natural data augmentation step for our data at hand because the bird images are taken from different angles by drones.
random crop: Randomly sample a section from the image and resize it to the original image size.

These data augmentation steps help models adapt to different orientations, locations, and scales of the same object class, and are expected to improve model performance.
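For object detection, the bounding boxes must be transformed together with the image. The following sketch shows an orthogonal (90 degree) rotation using only NumPy rather than an augmentation library. The function name `rotate90_ccw` is hypothetical, and boxes are assumed to be (x, y, width, height) tuples with (x, y) at the top-left corner.

```python
import numpy as np


def rotate90_ccw(image, boxes):
    """Rotate an image 90 degrees counter-clockwise together with its boxes.

    `boxes` holds (x, y, width, height) tuples with (x, y) the top-left
    corner in pixel coordinates of the unrotated image.
    """
    image_width = image.shape[1]
    rotated = np.rot90(image, k=1)  # k=1 is one counter-clockwise turn
    # A pixel at (row, col) moves to (image_width - 1 - col, row), so the
    # box swaps width and height and its new top edge comes from its old
    # right edge.
    new_boxes = [(y, image_width - (x + w), h, w) for x, y, w, h in boxes]
    return rotated, new_boxes


# Toy 6 x 8 image with one "bird" drawn as a block of ones.
image = np.zeros((6, 8), dtype=np.uint8)
image[1:3, 2:7] = 1  # rows 1-2, columns 2-6: box x=2, y=1, w=5, h=2
rotated, (new_box,) = rotate90_ccw(image, [(2, 1, 5, 2)])
nx, ny, nw, nh = new_box
print(new_box)  # (1, 1, 2, 5)
```

Non-orthogonal rotations are more involved because the rotated box is no longer axis-aligned and has to be replaced by an enclosing box.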

We utilized the imgaug library to generate modified images. We have tried several types of augmentations: flipping, blurring, adding Gaussian noise and changing color contrasts.

For the time being, our model is only trained on original data. We plan to retrain our model on the augmented dataset and compare performances. We are generating a larger training set using the augmentation methods mentioned above. Specifically, both the original images and the transformed images will be fed to the model in the training phase, but only original images will be used for evaluation and testing purposes.

Experiments

We use RetinaNet and Faster R-CNN models, both with a ResNet-50-FPN backbone. We first train a model to perform the simple task of detecting birds with no distinction between species. We then train a model to identify bird species, namely Brown Pelicans, Laughing Gulls, Mixed Terns, Great Blue Herons, and Great Egrets/White Morphs.

Due to the lack of annotated data available for other bird species, we re-label all other bird species under the "Other/Unknown" category.
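The relabeling step can be sketched as a simple lookup. The exact label strings used in the annotation files are an assumption here (they follow the column headings of the results table), and "Roseate Spoonbill" is only a hypothetical example of a species outside the five target classes.

```python
# Species kept as their own class; the label strings are assumed.
TARGET_SPECIES = {
    "Brown Pelican",
    "Laughing Gull",
    "Mixed Tern",
    "Great Blue Heron",
    "Great Egret/White Morph",
}


def relabel(species_label):
    """Map any species outside the target set to "Other/Unknown"."""
    if species_label in TARGET_SPECIES:
        return species_label
    return "Other/Unknown"


labels = ["Brown Pelican", "Roseate Spoonbill", "Mixed Tern"]
relabeled = [relabel(label) for label in labels]
print(relabeled)  # ['Brown Pelican', 'Other/Unknown', 'Mixed Tern']
```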

Note: The model weights used to initialize both the bird-only and bird-species detector come from a pre-trained model on the MS COCO dataset.

Bird-only detector (RetinaNet ResNet-50 FPN)

	Birds
AP (IoU = 0.5)	93.7%
AP (IoU = 0.75)	26.4%
mAP	43.7%

The high AP of 93.7% using an IoU threshold of 0.50 is very promising.

The mAP of 43.7% is comparable to the state-of-the-art results for challenging object detection tasks such as on the COCO dataset.
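The IoU (intersection over union) threshold decides when a predicted box counts as a correct detection. As a small illustration, assuming (x, y, width, height) boxes and a hypothetical helper named `iou`, the sketch shows how a prediction that is offset by only 5 pixels on a 40-pixel box passes at a threshold of 0.5 but fails at 0.75.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


truth = (100, 100, 40, 40)
prediction = (105, 105, 40, 40)  # shifted by 5 pixels in x and y
score = iou(truth, prediction)
print(round(score, 2))  # 0.62
print(score >= 0.5, score >= 0.75)  # True False
```

For small objects such as birds in aerial images, localization errors of a few pixels can therefore have a large effect on AP at stricter IoU thresholds.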

Bird species detector (Faster R-CNN ResNet-50 FPN)

	Brown Pelican	Laughing Gull	Mixed Tern	Great Blue Heron	Great Egret/White Morph	Other/Unknown	Overall
AP (IoU = 0.5)	98.8%	100.0%	97.6%	98.5%	96.9%	0.0%	82.0%
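The "Overall" value appears to be consistent with an unweighted mean of the six per-class AP values, although the source does not state how it was computed. The check below uses the numbers from the table; the variable names are illustrative.

```python
# Per-class AP (IoU = 0.5) in percent, copied from the table above.
ap_50 = {
    "Brown Pelican": 98.8,
    "Laughing Gull": 100.0,
    "Mixed Tern": 97.6,
    "Great Blue Heron": 98.5,
    "Great Egret/White Morph": 96.9,
    "Other/Unknown": 0.0,
}

overall = sum(ap_50.values()) / len(ap_50)
named_only = [ap for name, ap in ap_50.items() if name != "Other/Unknown"]
named_mean = sum(named_only) / len(named_only)

print(round(overall, 1))     # 82.0
print(round(named_mean, 1))  # 98.4
```

Under this reading, the single failing "Other/Unknown" class lowers the overall figure from about 98.4% to 82.0%.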

Using an IoU threshold of 0.50, the AP for every named bird species is higher than that of the bird-only detector, which is excellent. The exception is the "Other/Unknown" category, where the model fails completely (0.0% AP). Nevertheless, we can combine the results of the bird-only detector and the bird-species detector to compensate for the poor performance on the "Other/Unknown" category.
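One way such a combination could work is sketched here; it is an assumption for illustration, not a description of the project's method. Every box from the bird-only detector is kept, and it takes the label of the best-overlapping species prediction if the IoU reaches a threshold, otherwise it is labeled "Other/Unknown". The function names and the threshold of 0.5 are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def combine(bird_boxes, species_detections, iou_threshold=0.5):
    """Label each bird-only box with the best-overlapping species prediction.

    `species_detections` is a list of (box, label) pairs. Bird boxes with
    no species prediction at or above the threshold become "Other/Unknown".
    """
    results = []
    for bird_box in bird_boxes:
        best_label, best_iou = "Other/Unknown", iou_threshold
        for species_box, label in species_detections:
            overlap = iou(bird_box, species_box)
            if overlap >= best_iou:
                best_label, best_iou = label, overlap
        results.append((bird_box, best_label))
    return results


bird_boxes = [(10, 10, 40, 40), (200, 200, 30, 30)]
species_detections = [((12, 11, 40, 40), "Brown Pelican")]
combined = combine(bird_boxes, species_detections)
print([label for _, label in combined])  # ['Brown Pelican', 'Other/Unknown']
```

With this scheme the total bird count comes from the bird-only detector, while the species detector only contributes labels.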

Contributors

Krish Kabra
Email: krish.kabra@rice.edu
GitHub: @krishk97

Minxuan Luo
Email: ml122@rice.edu
GitHub: @minxuanluo

Alexander Xiong
Email: xionga27@rice.edu
GitHub: @awx1

William Lu
Email: wyl1@rice.edu
GitHub: @

Anna Vallery
Email: avallery@houstonaudubon.org

Richard Gibbons
Email: rgibbons@houstonaudubon.org

Hank Arnold
Email: hmarnold@msn.com

✤ This was the project for the course COMP 449/549 - Machine Learning and Data Science Projects (Fall 2021), at Rice University
