5m0k3 / gwd-yolov5-pytorch

My modified version of YoloV5 training, cross-validation and inference with Pseudo Labelling pytorch pipelines used in GWD Kaggle Competition

PyTorch-based YoloV5 solution - Global Wheat Detection

A complete PyTorch pipeline, with training, cross-validation and inference notebooks, used in the Kaggle competition Global Wheat Detection (May-Aug 2020)

Brief overview of the competition images

The wheat heads come from various sources:
[image: wheat heads from the different sources]
A few labeled images are shown below (bounding boxes in blue):
[images: two labeled examples]

Notebooks description

A brief description of each notebook is given here; for details, check the notebook comments.

[TRAIN] notebook

  1. Pre-Processing:
    - Handled noisy labels (e.g. boxes that are too big or too small)
    - Stratified 5-fold split based on image source
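
The two pre-processing steps can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: the area bounds `min_area` and `max_area`, the `[x, y, w, h]` box layout and the round-robin fold assignment are all assumptions made for the example.

```python
from collections import defaultdict


def filter_boxes(boxes, min_area=300, max_area=100_000):
    """Keep [x, y, w, h] boxes whose area lies within the (hypothetical) bounds."""
    return [b for b in boxes if min_area <= b[2] * b[3] <= max_area]


def stratified_folds(image_sources, n_folds=5):
    """Give each image a fold index so every fold gets a similar share of each source.

    image_sources maps image_id -> source name.
    """
    by_source = defaultdict(list)
    for image_id, source in sorted(image_sources.items()):
        by_source[source].append(image_id)

    folds = {}
    for source in sorted(by_source):
        # Deal the images of one source out to the folds in turn.
        for i, image_id in enumerate(by_source[source]):
            folds[image_id] = i % n_folds
    return folds
```

A real split would normally shuffle within each source first; scikit-learn's `StratifiedKFold` is a common way to do that.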

  2. Augmentations:
    - Albumentations - RandomSizedCrop, HueSaturationValue, RandomBrightnessContrast, RandomRotate90, Flip, Cutout, ShiftScaleRotate
    - Mixup - https://arxiv.org/pdf/1710.09412.pdf
      Two images are mixed together.
      [image: mixup example]
    - Mosaic - https://arxiv.org/pdf/2004.12432.pdf
      Four images are cropped and stitched together. By default, YoloV5 stitches the images onto a canvas whose size is a multiple of 32 pixels. For batch size = 4 the canvas looks like:
      [image: mosaic canvas, batch size 4]
      and for batch size = 2:
      [image: mosaic canvas, batch size 2]
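
As a rough idea of what mixing two images means for a detection task, here is a minimal sketch. It assumes both images have the same shape, uses a fixed blend weight `lam` (the mixup paper samples it from a Beta distribution), and assumes the boxes of both images are simply kept; the notebook may differ on all three points.

```python
import numpy as np


def mixup(image_a, boxes_a, image_b, boxes_b, lam=0.5):
    """Blend two same-sized uint8 images and concatenate their box lists."""
    mixed = lam * image_a.astype(np.float32) + (1.0 - lam) * image_b.astype(np.float32)
    # Boxes from both images remain valid because neither image is moved or resized.
    return mixed.astype(np.uint8), list(boxes_a) + list(boxes_b)
```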

  3. Configurations:
    - Default YoloV5 configuration

  4. TensorBoard Analysis:
    - YoloV5 uses TensorBoard by default during training. The best model is selected using a "fitness" criterion based on the following parameters:
      [images: fitness parameters]
      Some of my TensorBoard training logs can be found at TensorBoard.dev
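
A fitness criterion of this kind is essentially a weighted sum of validation metrics. The sketch below assumes the four metrics are precision, recall, mAP@0.5 and mAP@0.5:0.95, and the default weights are illustrative only: YoloV5 releases have used different weightings, so check the version you train with.

```python
def fitness(precision, recall, map50, map50_95, weights=(0.0, 0.0, 0.1, 0.9)):
    """Weighted sum of validation metrics; higher is better. Weights are an assumption."""
    metrics = (precision, recall, map50, map50_95)
    return sum(w * m for w, m in zip(weights, metrics))


def best_epoch(history):
    """Return the index of the epoch whose metrics give the highest fitness.

    history is a list of (precision, recall, map50, map50_95) tuples, one per epoch.
    """
    scores = [fitness(*row) for row in history]
    return max(range(len(scores)), key=scores.__getitem__)
```

With these weights, an epoch with high precision but low mAP loses to one with better mAP, which is the point of selecting on fitness rather than on a single metric.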

[CV] Cross Validation notebook

  1. Pre-Processing:
    - Same as in [TRAIN]

  2. Test Time Augmentations:
    - Flips and rotations
      [image: flip and rotation TTA examples]
    - Color shift
    - Scale (scale down with padding)

  3. Ensemble:
    - Supports ensembling multiple folds of the same model
    - Non-Maximum Suppression (NMS) is used to merge the predicted boxes into the final ensemble output
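
To make the ensembling step concrete: the boxes from all folds can be pooled into one list and passed through greedy NMS. The following is a plain-Python sketch under the assumption of `[x1, y1, x2, y2]` boxes and an IoU threshold of 0.5; in practice a library routine such as `torchvision.ops.nms` would be used instead.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score, keep one only if it does not
    overlap an already kept box by more than iou_threshold. Returns kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

For example, pooling `[[0, 0, 10, 10], [20, 20, 30, 30]]` from one fold with `[[1, 1, 11, 11]]` from another drops the near-duplicate of the first box and keeps the other two.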

  4. Automated Threshold Calculations:
    - The confidence threshold is calculated automatically against the ground-truth labels
    - This yields the optimal final CV score (Metric: IoU)
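
One way to picture the automated threshold calculation is a sweep over candidate confidence thresholds, scoring each against the ground truth and keeping the best. The sketch below uses a simplified per-image score, TP / (TP + FP + FN) at a single IoU threshold; that score, the candidate list and the data layout are assumptions for illustration, not necessarily what the notebook computes.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def score_at_threshold(preds, truths, conf_threshold, iou_threshold=0.5):
    """TP / (TP + FP + FN) for one image after dropping low-confidence predictions.

    preds is a list of (box, confidence); truths is a list of boxes.
    """
    kept = sorted((p for p in preds if p[1] >= conf_threshold),
                  key=lambda p: p[1], reverse=True)
    unmatched = list(truths)
    tp = 0
    for box, _ in kept:
        # Match each prediction to the best still-unmatched ground-truth box.
        best = max(unmatched, key=lambda t: iou(box, t), default=None)
        if best is not None and iou(box, best) >= iou_threshold:
            unmatched.remove(best)
            tp += 1
    fp = len(kept) - tp
    fn = len(unmatched)
    total = tp + fp + fn
    return tp / total if total else 1.0


def best_confidence_threshold(dataset, candidates):
    """Pick the candidate threshold with the highest mean score over all images.

    dataset is a list of (preds, truths) pairs, one per validation image.
    """
    def mean_score(threshold):
        return sum(score_at_threshold(p, g, threshold) for p, g in dataset) / len(dataset)

    return max(candidates, key=mean_score)
```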

[INFERENCE] Submission notebook

  1. Test Time Augmentations:
    - Same as in [CV]

  2. Pseudo Labelling:
    - Multi-Round Pseudo Labelling pipeline based on https://arxiv.org/pdf/1908.02983.pdf
    - Cross-validation is run at the end of each round to choose the best thresholds for the pseudo labels of the next round
    - The training pipeline is the same as in [TRAIN]
      [image: pseudo labelling pipeline]
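
Within one pseudo labelling round, the confident test predictions have to be turned into training labels. A minimal sketch of that conversion, assuming `[x1, y1, x2, y2]` pixel boxes, a single class with id 0, and the usual YOLO text label format (`class x_center y_center width height`, normalised to the image size); the function name and arguments are hypothetical:

```python
def to_yolo_label_lines(preds, image_width, image_height, conf_threshold, class_id=0):
    """Turn confident (box, confidence) predictions into YOLO-format label lines."""
    lines = []
    for (x1, y1, x2, y2), conf in preds:
        if conf < conf_threshold:
            continue  # too uncertain to be used as a pseudo label
        x_center = (x1 + x2) / 2 / image_width
        y_center = (y1 + y2) / 2 / image_height
        width = (x2 - x1) / image_width
        height = (y2 - y1) / image_height
        lines.append(f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}")
    return lines
```

The resulting lines would be written to one `.txt` file per test image and added to the training set for the next round, with `conf_threshold` taken from that round's cross-validation.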

  3. Post-Processing and Result:
    - Final predictions are made with an ensemble of TTA combinations

How to use

Just change the directories according to your environment.

Google Colab versions are available for the [TRAIN] and [CV] notebooks.

If you run into deprecation issues or warnings in the future, use the modules available in the YoloV5-Mixup folder.

Improvements

Acknowledging shortcomings is the first step towards progress, so here are the possible improvements that could have made my model better:

  • Ensemble multi-model/multi-fold predictions for pseudo labels; currently a single model is used to make them. This would also have made the model more robust to noise.
  • A GAN or style transfer could have been used to produce more similar labeled images from the current train images, for better generalization.
  • Relabel noisy labels using multiple folds.
