RiceD2KLab / Audubon_F21

TO BE UPDATED SOON!

The current codebase will be updated soon to reflect the results found in the following paper:

Deep object detection for waterbird monitoring using aerial imagery
Krish Kabra*,1, Alexander Xiong*,1, Wenbin Li*,1, Minxuan Luo1, William Lu1, Raul Garcia1, Dhananjay Vijay1, Jiahui Yu1, Maojie Tang1, Tianjiao Yu1, Hank Arnold2, Anna Vallery2, Richard Gibbons3, Arko Barman1
* equal contribution
1Rice University, Houston, TX 77005, USA
2Houston Audubon Society, Houston, TX 77079, USA
3American Bird Conservancy, The Plains, VA 20198, USA

Stay tuned!


Team Audubon

Development of Machine Learning Algorithms for Precision Waterbird Monitoring


Table of Contents
  1. ➤ About The Project
  2. ➤ Prerequisites
  3. ➤ Folder Structure
  4. ➤ Installation & Usage Instructions
  5. ➤ Dataset
  6. ➤ Preprocessing
  7. ➤ Results and Discussion
  8. ➤ References
  9. ➤ Contributors

-----------------------------------------------------

About The Project

To improve both the accuracy and the speed of bird counts, Houston Audubon and students from the D2K capstone course at Rice University are developing machine learning and computer vision algorithms that detect birds in UAV images. The specific goals are to:

  1. Count and survey the number of birds.
  2. Identify different species of detected birds.

-----------------------------------------------------

Prerequisites

The following open source packages are used in this project:

  • Numpy
  • Pandas
  • Matplotlib
  • OpenCV
  • Detectron2
  • WAndB

-----------------------------------------------------

Folder Structure

code
.
├── configs
├────── (useful sweep config files for WAndB)
├── scripts
├────── data_exploration.py
├── utils
├────── config.py
├────── cropping.py
├────── dataloader.py
├────── evaluation.py
├────── plotting.py
├────── trainer.py
├── README.md
├── requirements.txt
├── data_exploration.py  
├── Audubon-Bird-Detection-Tutorial.ipynb
├── train_net.py
├── wandb_train_net.py

-----------------------------------------------------

Installation & Usage Instructions

  1. Clone the repository

    git clone https://github.com/RiceD2KLab/Audubon_F21.git

  2. Install PyTorch (see the official PyTorch installation instructions)

    Requirements: Linux or macOS with Python ≥ 3.6
    pip3 install torch==1.10.0+cu102 torchvision==0.11.1+cu102 -f https://download.pytorch.org/whl/cu102/torch_stable.html

  3. Install Detectron2 (see the official Detectron2 installation instructions)

    Requirements: Linux or macOS with Python ≥ 3.6
    For Windows: Detectron2 is continuously built on Windows with CircleCI. However, official support for it is not provided.
    python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
    # (add --user if you don't have permission)

    # Or, to install it from a local clone:
    git clone https://github.com/facebookresearch/detectron2.git
    python -m pip install -e detectron2

    # On macOS, you may need to prepend the above commands with a few environment variables:
    CC=clang CXX=clang++ ARCHFLAGS="-arch x86_64" python -m pip install ...

  4. Install other dependencies

    pip install -r requirements.txt

  5. Execute the scripts as required.

    See train_net.py, wandb_train_net.py, or the Colab notebook for usage of the code.

-----------------------------------------------------

Data Science Pipeline

-----------------------------------------------------

Dataset

Houston Audubon provided a 52 GB dataset of images captured with a DJI M300RTK UAV fitted with a P1 camera. The images are high resolution, typically 8192 × 5460 pixels. The dataset contains 3 GB of annotated images, each with a corresponding CSV file specifying species labels and bounding box locations. The annotated set features 19,276 birds of 15 species; the remaining 50.5 GB are raw images without annotations. Each CSV file contains:

  • species id: unique species id in integer
  • species label: species label in words
  • x: x min of a bounding box
  • y: y min of a bounding box
  • width: width of a bounding box
  • height: height of a bounding box
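
As a rough illustration of the annotation format, the sketch below parses one annotation CSV with Python's standard csv module and converts each box from (x, y, width, height) to corner coordinates. The column names (`species_id`, `species_label`, `x`, `y`, `width`, `height`) and the sample rows are assumptions made for this example; the header names in the actual dataset files may differ.

```python
import csv
import io

# Hypothetical sample rows; real annotation files may use different header names.
SAMPLE_CSV = """species_id,species_label,x,y,width,height
1,Brown Pelican,120,340,60,45
2,Laughing Gull,900,1500,30,28
"""


def read_annotations(handle):
    """Parse annotation rows and add corner coordinates for each box."""
    annotations = []
    for row in csv.DictReader(handle):
        x, y = int(row["x"]), int(row["y"])
        w, h = int(row["width"]), int(row["height"])
        annotations.append({
            "species_id": int(row["species_id"]),
            "species_label": row["species_label"],
            # (x_min, y_min, x_max, y_max) in pixels
            "bbox_xyxy": (x, y, x + w, y + h),
        })
    return annotations


annotations = read_annotations(io.StringIO(SAMPLE_CSV))
for ann in annotations:
    print(ann["species_label"], ann["bbox_xyxy"])
```

To read a real file, pass an open file handle instead of the in-memory sample, for example `read_annotations(open(path, newline=""))`.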

-----------------------------------------------------

Preprocessing

The data wrangling module of the pipeline largely involves preparing the data to be fed into deep learning models used to detect objects, namely birds. Our data wrangling process includes:

  1. Tiling
  2. Data Augmentation

-----------------------------------------------------

Tiling

In practice, object detection models train faster and often perform better on smaller images; 600 × 600 pixels is a common input size for typical object detection models. Our first step was therefore to split the 8192 × 5460 images into tiles. The tile size is configurable and defaults to 600 × 600.
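
The tiling grid can be sketched as follows. This is a minimal illustration, not the code in utils/cropping.py: the function name `tile_boxes` is hypothetical, and clipping the edge tiles to the image border is an assumption (padding them to the full tile size would be an equally plausible choice).

```python
def tile_boxes(image_w, image_h, tile_w=600, tile_h=600):
    """Return (x0, y0, x1, y1) for each tile, row by row.

    Edge tiles are clipped to the image, so they may be smaller than
    tile_w x tile_h (an assumption made for this sketch).
    """
    tiles = []
    for y0 in range(0, image_h, tile_h):
        for x0 in range(0, image_w, tile_w):
            x1 = min(x0 + tile_w, image_w)
            y1 = min(y0 + tile_h, image_h)
            tiles.append((x0, y0, x1, y1))
    return tiles


tiles = tile_boxes(8192, 5460)
print(len(tiles))   # 14 columns x 10 rows = 140 tiles
print(tiles[-1])    # (7800, 5400, 8192, 5460): a clipped corner tile
```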

A caveat of this approach is that some birds will unavoidably be cut in two and appear in two neighboring tiles, as seen in Figure 2. Because counting birds is one of the objectives, the same problem must also be handled in the detection phase. During tiling, only the tile that contains more than 50% of the cropped bird keeps the bounding box; the remaining part of the box in the neighboring tile is discarded. This means that we are training the model to detect both complete and partial birds.
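
The 50% rule above can be sketched as a small helper. This is an illustrative version under stated assumptions: the function name `box_in_tile` is hypothetical, the kept box is clipped to the tile and expressed in tile-local coordinates, and "over 50%" is read as a strict inequality on the box area.

```python
def box_in_tile(box, tile, min_fraction=0.5):
    """Return the box clipped to the tile in tile-local (x, y, w, h),
    or None if at most min_fraction of the box area lies inside the tile.

    box:  (x, y, w, h) in full-image pixels
    tile: (x0, y0, x1, y1) in full-image pixels
    """
    x, y, w, h = box
    if w <= 0 or h <= 0:
        return None
    tx0, ty0, tx1, ty1 = tile
    # Intersection of the box with the tile
    ix0, iy0 = max(x, tx0), max(y, ty0)
    ix1, iy1 = min(x + w, tx1), min(y + h, ty1)
    iw, ih = ix1 - ix0, iy1 - iy0
    if iw <= 0 or ih <= 0:
        return None
    if (iw * ih) / (w * h) <= min_fraction:
        return None
    return (ix0 - tx0, iy0 - ty0, iw, ih)


bird = (580, 100, 50, 40)           # straddles the boundary at x = 600
left_tile = (0, 0, 600, 600)
right_tile = (600, 0, 1200, 600)
print(box_in_tile(bird, left_tile))    # None: only 40% of the box is here
print(box_in_tile(bird, right_tile))   # (0, 100, 30, 40): 60% of the box
```

Under this strict reading, a bird split exactly in half would be dropped from both tiles, so a real implementation may want a tie-breaking rule.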

In the detection stage, if double counting turns out to be a common pattern, we will also try to develop a mechanism that merges partial detections in neighboring tiles and counts them as one bird.
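
One possible shape for such a merging step is sketched below. It is not the project's method, only an illustration: it assumes detections have already been mapped to full-image corner coordinates, and it merges two boxes from different tiles when they abut along one axis (gap of at most `tol` pixels) and overlap along the other axis by at least `min_overlap` of the shorter side. All names and thresholds are hypothetical.

```python
def _overlap_1d(a0, a1, b0, b1):
    """Length of the overlap between intervals [a0, a1] and [b0, b1]."""
    return max(0, min(a1, b1) - max(a0, b0))


def should_merge(a, b, tol=2, min_overlap=0.5):
    """True if boxes a and b (x0, y0, x1, y1) abut across a tile edge."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    gap_x = max(bx0 - ax1, ax0 - bx1)   # negative when the x-ranges overlap
    gap_y = max(by0 - ay1, ay0 - by1)
    ov_x = _overlap_1d(ax0, ax1, bx0, bx1)
    ov_y = _overlap_1d(ay0, ay1, by0, by1)
    side_by_side = 0 <= gap_x <= tol and ov_y >= min_overlap * min(ay1 - ay0, by1 - by0)
    stacked = 0 <= gap_y <= tol and ov_x >= min_overlap * min(ax1 - ax0, bx1 - bx0)
    return side_by_side or stacked


def merge_across_tiles(detections, tol=2, min_overlap=0.5):
    """Greedy single-pass merge.

    detections: list of (tile_id, (x0, y0, x1, y1)) in full-image pixels.
    Boxes from the same tile are never merged with each other.
    """
    used = [False] * len(detections)
    merged = []
    for i, (tile_i, box_i) in enumerate(detections):
        if used[i]:
            continue
        current = box_i
        for j in range(i + 1, len(detections)):
            tile_j, box_j = detections[j]
            if used[j] or tile_j == tile_i:
                continue
            if should_merge(current, box_j, tol, min_overlap):
                current = (min(current[0], box_j[0]), min(current[1], box_j[1]),
                           max(current[2], box_j[2]), max(current[3], box_j[3]))
                used[j] = True
        merged.append(current)
    return merged


detections = [
    (0, (580, 100, 600, 140)),   # left half of a bird, tile 0
    (1, (600, 102, 630, 141)),   # right half of the same bird, tile 1
    (1, (900, 300, 940, 330)),   # a separate bird in tile 1
]
merged = merge_across_tiles(detections)
print(merged)   # [(580, 100, 630, 141), (900, 300, 940, 330)]
```

A single greedy pass like this can miss birds split across three or four tiles at a corner, and it can wrongly merge two distinct birds that happen to sit on opposite sides of a tile edge, so the thresholds would need tuning on real detections.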

-----------------------------------------------------

Data Augmentation

As a rule of thumb, deep learning models are effective with about 1,000 images per class, but some bird species do not have abundant training samples in our dataset. Our team plans to make the models more robust via data augmentation, that is, by training them on synthetically modified data:

  • rotation: Orthogonal or non-orthogonal rotations. Rotation is a natural data augmentation step for our data at hand because the bird images are taken from different angles by drones.
  • random crop: Randomly sample a section from the image and resize it to the original image size.

These data augmentation steps help models adapt to different orientations, locations, and scales of the same object class, and should improve model performance.
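
When an image is rotated or cropped, its bounding boxes have to be transformed in the same way. The sketch below shows the box arithmetic for a 90-degree clockwise rotation and for a crop followed by a resize. The function names are hypothetical, boxes are assumed to be in (x, y, width, height) form as in the CSV files, and the crop example assumes the box lies fully inside the crop; in practice an augmentation library would handle these cases.

```python
def rotate_box_90_cw(box, image_h):
    """Rotate an (x, y, w, h) box by 90 degrees clockwise with its image.

    image_h is the image height before rotation. After the rotation the
    left edge of the image becomes the top edge, so width and height swap.
    """
    x, y, w, h = box
    return (image_h - y - h, x, h, w)


def crop_and_resize_box(box, crop, out_w, out_h):
    """Map an (x, y, w, h) box into a crop that is resized to out_w x out_h.

    crop is (cx, cy, cw, ch) in the original image. The box is assumed to
    lie fully inside the crop.
    """
    x, y, w, h = box
    cx, cy, cw, ch = crop
    sx, sy = out_w / cw, out_h / ch
    return ((x - cx) * sx, (y - cy) * sy, w * sx, h * sy)


print(rotate_box_90_cw((10, 20, 30, 40), image_h=600))
# (540, 10, 40, 30)

print(crop_and_resize_box((150, 200, 30, 40), (100, 100, 300, 300), 600, 600))
# (100.0, 200.0, 60.0, 80.0)
```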

We utilized the imgaug library to generate modified images. We have tried several types of augmentations: flipping, blurring, adding Gaussian noise and changing color contrasts.

For the time being, our model is only trained on original data. We plan to retrain our model on the augmented dataset and compare performances. We are generating a larger training set using the augmentation methods mentioned above. Specifically, both the original images and the transformed images will be fed to the model in the training phase, but only original images will be used for evaluation and testing purposes.

-----------------------------------------------------

Experiments

We use RetinaNet and Faster R-CNN models, both with a ResNet-50-FPN backbone. We first train a model to perform the simple task of detecting birds with no distinction of species. We then train a model to identify bird species: namely, Brown Pelicans, Laughing Gulls, Mixed Terns, Great Blue Herons, and Great Egrets/White Morphs.

Due to the lack of annotated data available for other bird species, we re-label all other bird species under the "Other/Unknown" category.
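
The relabeling step can be sketched as a simple lookup. The label strings below follow the category names used in this README; the exact strings in the annotation files, and the example species passed in, are assumptions.

```python
# Species kept as separate classes (names as written in this README).
TARGET_SPECIES = {
    "Brown Pelican",
    "Laughing Gull",
    "Mixed Tern",
    "Great Blue Heron",
    "Great Egret/White Morph",
}
OTHER_LABEL = "Other/Unknown"


def relabel(species_label):
    """Keep target species as-is and map everything else to Other/Unknown."""
    return species_label if species_label in TARGET_SPECIES else OTHER_LABEL


# "Roseate Spoonbill" is a hypothetical example of a low-count species.
labels = ["Brown Pelican", "Roseate Spoonbill", "Mixed Tern"]
print([relabel(label) for label in labels])
# ['Brown Pelican', 'Other/Unknown', 'Mixed Tern']
```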

Note: The model weights used to initialize both the bird-only and bird-species detectors come from a model pre-trained on the MS COCO dataset.

  1. Bird-only detector (RetinaNet ResNet-50 FPN)

    Metric             Birds
    AP (IoU = 0.5)     93.7%
    AP (IoU = 0.75)    26.4%
    mAP                43.7%

    The high AP of 93.7% using an IoU threshold of 0.50 is very promising.

    The mAP of 43.7% is comparable to the state-of-the-art results for challenging object detection tasks such as on the COCO dataset.

  2. Bird species detector (Faster R-CNN ResNet-50 FPN)

    Species                    AP (IoU = 0.5)
    Brown Pelican              98.8%
    Laughing Gull              100.0%
    Mixed Tern                 97.6%
    Great Blue Heron           98.5%
    Great Egret/White Morph    96.9%
    Other/Unknown              0.0%
    Overall                    82.0%

    At an IoU threshold of 0.50, the AP for every named bird species is higher than that of the bird-only detector, which is excellent. The exception is the "Other/Unknown" category, where the model fails to classify entirely. Nevertheless, we can combine the results from the bird-only detector and the bird-species detector to compensate for the poor performance on the "Other/Unknown" category.
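
One way such a combination could work is sketched below; it is an illustration, not the project's confirmed procedure. Each box from the bird-only detector takes the label of the best-overlapping species detection if their IoU reaches a threshold, and falls back to "Other/Unknown" otherwise. The function names, the 0.5 threshold, and the (x0, y0, x1, y1) box format are assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def combine_detections(bird_boxes, species_dets, iou_thresh=0.5):
    """Label each bird-only box using the species detector where possible.

    bird_boxes:   list of boxes from the bird-only detector
    species_dets: list of (label, box) from the species detector
    """
    results = []
    for box in bird_boxes:
        best_label, best_iou = "Other/Unknown", iou_thresh
        for label, species_box in species_dets:
            score = iou(box, species_box)
            if score >= best_iou:
                best_label, best_iou = label, score
        results.append((best_label, box))
    return results


bird_boxes = [(0, 0, 10, 10), (100, 100, 120, 120)]
species_dets = [("Brown Pelican", (1, 1, 10, 10))]
combined = combine_detections(bird_boxes, species_dets)
print(combined)
# [('Brown Pelican', (0, 0, 10, 10)), ('Other/Unknown', (100, 100, 120, 120))]
```

The same `iou` function is the overlap measure behind the AP (IoU = 0.5) and AP (IoU = 0.75) figures: a detection counts as correct only if its IoU with a ground-truth box reaches the stated threshold.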

-----------------------------------------------------

Contributors

Krish Kabra
      Email: krish.kabra@rice.edu
      GitHub: @krishk97

Minxuan Luo
      Email: ml122@rice.edu
      GitHub: @minxuanluo

Alexander Xiong
      Email: xionga27@rice.edu
      GitHub: @awx1

William Lu
      Email: wyl1@rice.edu
      GitHub: @

Anna Vallery
      Email: avallery@houstonaudubon.org

Richard Gibbons
      Email: rgibbons@houstonaudubon.org

Hank Arnold
      Email: hmarnold@msn.com


This was the project for the course COMP 449/549 - Machine Learning and Data Science Projects (Fall 2021) at Rice University.
