theamachine / Deep-Convolutional-Neural-Network-Architecture-for-Plant-Seedling-Classification

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Deep-Convolutional-Neural-Network-Architecture-for-Plant-Seedling-Classification

Introduction

In this project, you will apply your knowledge of CNNs we want to estimate the growth stage of weeds using the number of leaves of the plant. The more leaves, the more the weed has grown.

The purpose of this assignment is to gain experience building and training neural networks. You will gain:

  • More experience training CNNs
  • Experience with problem reformulation
  • Experience with techniques for improving results (Regularization, Data Augmentation)

You must use Keras with the Tensorflow backend, i.e., the package tensorflow.keras. For this assignment, you may use other tensorflow packages and scikit-learn, scikit-image or pandas but not other deep learning frameworks, e.g., pytorch, mxnet etc.

Part 0. Data Preparation

The data for this project are plant images at different resolutions captured with a variety of cameras. There are images showing plants with approximatelty 1,2,3,4 and 6 leafs. The images are part of a Leaf counting dataset by Teimouri et al. [1] which can be downloaded from the Aarhus University, Denmark:

Leaf counting dataset (Required files are posted on Brightspace)

There are 200 images for each of the 5 classes. As Figure 1 shows, there is a great variety of plants and image conditions. The dataset is split into a training and a testing set where there are 180 images per class for training and validation; and 20 images for testing.

In this section:

  • Download the dataset as described above
  • Use the splits provided in the Brightspace files
  • Visualize five images from the dataset.
## Part 1a. Transfer Learning - Classification Network For this project, Keras implementation of VGG-16 is used as a starting point.

Using the first 2 blocks of VGG-16 and adding extra Keras layers to create your own version of a CNN network for the classification of the images according to the number of leaves in the plant images. Note that there will be 5 classes. The last layer from VGG-16 will be block2 pool and you are allowed to add no more than five fully connected or convolutional layers to the network including the final output layer.

  • You can use as many pooling, flattening, 1 × 1 convolution layers, etc. as you wish but do not use any regularization.
  • Train this simple network on the training set while monitoring convergence on the validation set.
  • As input to the model use images of size no larger than 128×128.

Note, it is highly recommended to use even smaller input images to try things out. You are not expected to fine-tune the initial VGG layers.

When your classifier is working:

  • Plot a loss curve for training and validation data
  • Plot an accuracy curve for training and validation data
  • Provide confusion matrix of your network on the training including validation and testing data sets.

Part 1b. Transfer Learning - Regression Reformulation

Step 1

Repeat the steps of Part 1a. but reformulate as a regression problem, i.e., your network needs to output a single float value ranging between 0 to 6 corresponding to the number of leaves. Again, you are not expected to fine-tune the initial VGG layers.

  • Plot a loss curve for training and validation data
  • Plot an accuracy curve for training and validation data
  • Provide confusion matrix of your network on the training including validation and testing data sets.

Step 2

The size of the training data is quite small. Discuss based on your learning curves if overfitting is occurring with the models from Parts 1a and 1b.

Part 2. Improve your Model

Regularization and data augmentation are common strategies to deal with small datasets.

Step 1

Incorporate Batch Normalization and Dropout into your design the superior network trained in Part 1. You are not expected to fine-tune the initial VGG layers. Again you will provide the following:

  • A loss curve for training and validation data
  • An accuracy curve for training and validation data
  • A confusion matrix of your network on the training including validation and testing data sets.

Step 2

Train the same model from Step 1, now using data augmentation. Again, please provide the same output metrics as in Step 1.

Step 3

Discussion based on the learning curves and final metrics in Step 2, how large a improvement can be observed from regularization and data augmentation.

About

License:MIT License


Languages

Language:Jupyter Notebook 100.0%