YOLOv4-PyTorch

Overview

This project is inspired by ultralytics/yolov3 and AlexeyAB/darknet. Thanks to both projects.

This project is a YOLOv4 object detection system implemented in PyTorch.

The goal of this implementation is to be simple, highly extensible, and easy to integrate into your own projects. This implementation is a work in progress -- new features are currently being implemented.

Table of contents

  1. About YOLOv4
  2. Installation
  3. Usage
  4. Train on Custom Dataset
  5. Credit

About YOLOv4

YOLOv4 (Bochkovskiy et al., 2020) studies which CNN features transfer across models, tasks, and datasets, and combines the universal ones (Weighted-Residual-Connections, Cross-Stage-Partial connections, Cross mini-Batch Normalization, Self-Adversarial Training, Mish activation, Mosaic data augmentation, DropBlock regularization, and CIoU loss) to achieve state-of-the-art results: 43.5% AP (65.7% AP50) on the MS COCO dataset at a real-time speed of ~65 FPS on a Tesla V100. The full abstract is reproduced in the Credit section below.
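Of the features above, the CIoU loss is the easiest to make concrete. Below is a minimal, illustrative sketch of the CIoU metric for two axis-aligned boxes in (x_center, y_center, w, h) form; the training loss is then 1 - CIoU. This is a sketch for explanation only, not the loss code this repo actually uses.

import math

def ciou(box1, box2, eps=1e-9):
    # Illustrative CIoU (Zheng et al., 2020): IoU minus a center-distance term minus an aspect-ratio term.
    x1, y1, w1, h1 = box1
    x2, y2, w2, h2 = box2
    # Convert to corner coordinates.
    b1x1, b1y1, b1x2, b1y2 = x1 - w1 / 2, y1 - h1 / 2, x1 + w1 / 2, y1 + h1 / 2
    b2x1, b2y1, b2x2, b2y2 = x2 - w2 / 2, y2 - h2 / 2, x2 + w2 / 2, y2 + h2 / 2
    # Intersection over union.
    iw = max(0.0, min(b1x2, b2x2) - max(b1x1, b2x1))
    ih = max(0.0, min(b1y2, b2y2) - max(b1y1, b2y1))
    inter = iw * ih
    union = w1 * h1 + w2 * h2 - inter + eps
    iou = inter / union
    # Squared center distance over squared diagonal of the smallest enclosing box.
    cw = max(b1x2, b2x2) - min(b1x1, b2x1)
    ch = max(b1y2, b2y2) - min(b1y1, b2y1)
    rho2 = (x2 - x1) ** 2 + (y2 - y1) ** 2
    c2 = cw ** 2 + ch ** 2 + eps
    # Aspect-ratio consistency term.
    v = (4 / math.pi ** 2) * (math.atan(w2 / h2) - math.atan(w1 / h1)) ** 2
    alpha = v / (1 - iou + v + eps)
    return iou - rho2 / c2 - alpha * v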

Installation

Clone and install requirements

git clone https://github.com/Lornatang/YOLOv4-PyTorch.git
cd YOLOv4-PyTorch/
pip install -r requirements.txt

Download pre-trained weights

cd weights/
bash download_weights.sh

Download PascalVoc2007

cd data/
bash get_voc_dataset.sh

Download COCO2014

cd data/
bash get_coco2014_dataset.sh

Download COCO2017

cd data/
bash get_coco2017_dataset.sh

Usage

Train

  • Example (COCO2017)

To train on COCO2017 run:

python train.py --config-file configs/COCO-Detection/yolov5-small.yaml --data data/coco2017.yaml --weights ""
  • Example (VOC2007+2012)

To train on VOC07+12 run:

python train.py --config-file configs/PascalVOC-Detection/yolov5-small.yaml --data data/voc2007.yaml --weights ""
  • Other training methods

Normal Training: run python train.py --config-file configs/COCO-Detection/yolov5-small.yaml --data data/coco2014.yaml --weights "" to begin training after downloading COCO data with data/get_coco2014_dataset.sh. Each epoch trains on 117,263 images drawn from the COCO train and validation sets, and tests on 5,000 images from the COCO validation set.

Resume Training: run python train.py --config-file configs/COCO-Detection/yolov5-small.yaml --data data/coco2014.yaml --resume to resume training from weights/checkpoint.pth.
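The checkpoint logic itself follows the standard PyTorch pattern. The sketch below uses hypothetical helper names and is not copied from train.py; it only illustrates what resuming from weights/checkpoint.pth typically involves.

import torch

def save_checkpoint(model, optimizer, epoch, path="weights/checkpoint.pth"):
    # Persist everything needed to continue training, not just the weights.
    torch.save({"model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
                "epoch": epoch}, path)

def resume_from_checkpoint(model, optimizer, path="weights/checkpoint.pth"):
    # Restore model and optimizer state, and return the next epoch to run.
    state = torch.load(path, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["epoch"] + 1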

Test

All numbers were obtained on a local server with 2 NVIDIA GeForce RTX 2080 SUPER GPUs connected via NVLink, running PyTorch 1.5.1, CUDA 10.2, and cuDNN 7.6.5.

  • Example (COCO2017)

To evaluate on COCO2017 run:

python test.py --config-file configs/COCO-Detection/yolov5-small.yaml --data data/coco2017.yaml --weights weights/COCO-Detection/yolov5-small.pth
  • Example (VOC2007+2012)

To evaluate on VOC07+12 run:

python test.py --config-file configs/PascalVOC-Detection/yolov5-small.yaml --data data/voc2007.yaml --weights weights/PascalVOC-Detection/yolov5-small.pth

Common Settings for VOC Models

  • All VOC models were trained on voc2007_trainval + voc2012_trainval and evaluated on voc2007_test.
  • The default settings are not directly comparable with YOLOv4's or Detectron's standard settings. For example, our default training data augmentation uses scale jittering in addition to horizontal flipping.
  • For YOLOv3/YOLOv4, we provide baselines based on 2 different backbone combinations:
    • Darknet-53: Use a ResNet+VGG backbone with standard conv and FC heads for mask and box prediction, respectively.
    • CSPDarknet-53: Use a ResNet+CSPNet backbone with standard conv and FC heads for mask and box prediction, respectively. It obtains the best speed/accuracy tradeoff, but the other is still useful for research.
Pascal VOC Object Detection Baselines
| Model | train time (s/iter) | inference time (ms/im) | train mem (GB) | AP(test) | AP50 | FPS | params | FLOPs | download |
| ------------- | ---- | ---- | --- | ---- | ---- | --- | ------ | ------ | -------- |
| MobileNet-v1 | 9.6 | 2.5 | 4.0 | 31.2 | 61.3 | 400 | 4.95M | 11.3B | model |
| VGG16 | - | - | - | - | - | - | - | - | - |
| YOLOv3-Tiny | 10.5 | 1.5 | 3.7 | 24.3 | 53.1 | 667 | 7.96M | 10.5B | model |
| YOLOv3 | 2.4 | 6.7 | 6.4 | 57.9 | 82.6 | 149 | 61.79M | 155.6B | model |
| YOLOv3-SPP | 2.4 | 6.7 | 6.3 | 59.7 | 83.3 | 149 | 62.84M | 156.5B | model |
| YOLOv4-Tiny | 12.3 | 1.5 | 2.7 | 20.0 | 46.0 | 667 | 3.10M | 6.5B | model |
| YOLOv4 | 2.1 | 7.5 | 6.7 | 61.4 | 83.7 | 133 | 60.52M | 131.6B | model |
| YOLOv5-small | 5.4 | 2.3 | 1.7 | 49.3 | 75.9 | 435 | 7.31M | 17.0B | model |
| YOLOv5-medium | 3.6 | 3.8 | 3.1 | 56.5 | 80.3 | 263 | 21.56M | 51.7B | model |
| YOLOv5-large | 2.8 | 6.1 | 5.1 | 59.4 | 81.6 | 164 | 47.50M | 116.4B | model |
| YOLOv5-xlarge | 1.4 | 10.8 | 7.2 | 60.2 | 82.6 | 93 | 88.56M | 220.6B | model |

Common Settings for COCO Models

  • All COCO models were trained on train2017 and evaluated on val2017.
  • The default settings are not directly comparable with YOLOv4's or Detectron's standard settings. For example, our default training data augmentation uses scale jittering in addition to horizontal flipping.
  • For YOLOv3/YOLOv4, we provide baselines based on 3 different backbone combinations:
    • Darknet-53: Use a ResNet+VGG backbone with standard conv and FC heads for mask and box prediction, respectively.
    • CSPDarknet-53: Use a ResNet+CSPNet backbone with standard conv and FC heads for mask and box prediction, respectively. It obtains the best speed/accuracy tradeoff, but the other two are still useful for research.
    • GhostDarknet-53: Use a ResNet+Ghost backbone with standard conv and FC heads for mask and box prediction, respectively.
COCO Object Detection Baselines
| Model | train time (s/iter) | inference time (ms/im) | train mem (GB) | AP(test) | AP50 | FPS | params | FLOPs | download |
| ------------- | ---- | ---- | --- | ---- | ---- | --- | ------ | ------ | -------- |
| MobileNet-v1 | - | - | - | - | - | - | - | - | - |
| VGG16 | - | - | - | - | - | - | - | - | - |
| YOLOv3-Tiny | - | - | - | - | - | - | - | - | - |
| YOLOv3 | - | - | - | - | - | - | - | - | - |
| YOLOv3-SPP | - | - | - | - | - | - | - | - | - |
| YOLOv4 | - | - | - | - | - | - | - | - | - |
| YOLOv4-Tiny | - | - | - | - | - | - | - | - | - |
| YOLOv5-small | - | - | - | - | - | - | - | - | - |
| YOLOv5-medium | - | - | - | - | - | - | - | - | - |
| YOLOv5-large | - | - | - | - | - | - | - | - | - |
| YOLOv5-xlarge | - | - | - | - | - | - | - | - | - |

Inference

detect.py runs inference on a variety of sources (a concrete example follows the list):

python detect.py --cfg configs/COCO-Detection/yolov5-small.yaml  --data data/coco2014.yaml --weights weights/COCO-Detection/yolov5-small.pth  --source ...
  • Image: --source file.jpg
  • Video: --source file.mp4
  • Directory: --source dir/
  • Webcam: --source 0
  • HTTP stream: --source https://v.qq.com/x/page/x30366izba3.html
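For example, to run the small COCO model over every image in a directory (data/samples/ is a placeholder; substitute your own path):

python detect.py --cfg configs/COCO-Detection/yolov5-small.yaml --data data/coco2014.yaml --weights weights/COCO-Detection/yolov5-small.pth --source data/samples/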

Train on Custom Dataset

Run the commands below to create a custom model definition, replacing your-dataset-num-classes with the number of classes in your dataset.

# move to configs dir
cd configs/
# create custom model 'yolov3-custom.yaml' (only two parameters need to be modified; see create_model.sh)
bash create_model.sh your-dataset-num-classes

Data configuration

Add class names to data/custom.yaml. This file should have one row per class name.
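The sketch below shows one plausible layout for data/custom.yaml. The keys here are assumptions in the style of the bundled data files; mirror an existing file such as data/coco2014.yaml for the exact schema this repo expects.

# data/custom.yaml (illustrative sketch; verify keys against data/coco2014.yaml)
train: data/custom/train.txt   # file listing training image paths (assumed key)
val: data/custom/val.txt       # file listing validation image paths (assumed key)
nc: 2                          # number of classes (assumed key)
names: ['cat', 'dog']          # one name per class, zero-indexed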

Image Folder

Move the images of your dataset to data/custom/images/.

Annotation Folder

Move your annotations to data/custom/labels/. The dataloader expects the annotation file corresponding to the image data/custom/images/train.jpg to be at data/custom/labels/train.txt. Each row in the annotation file defines one bounding box using the syntax label_idx x_center y_center width height. The coordinates must be scaled to [0, 1], and label_idx must be zero-indexed, corresponding to the row number of the class name in data/custom/classes.names.
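To make the format concrete, here is a small helper that converts a pixel-space corner box into one label row; it is an illustrative utility, not part of this repo.

def to_yolo_row(label_idx, xmin, ymin, xmax, ymax, img_w, img_h):
    # Produce "label_idx x_center y_center width height" with coordinates scaled to [0, 1].
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{label_idx} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# A 200x100 box centered in a 640x480 image, class 0:
print(to_yolo_row(0, 220, 190, 420, 290, 640, 480))
# -> 0 0.500000 0.500000 0.312500 0.208333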

Define Train and Validation Sets

In data/custom/train.txt and data/custom/val.txt, add paths to images that will be used as train and validation data respectively.
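If you do not already have split files, a few lines of Python can generate them. This is a hypothetical convenience script, not something shipped with the repo.

import random
from pathlib import Path

# Collect image paths (sorted for reproducibility), then write a seeded 90/10 train/val split.
images = sorted(str(p) for p in Path("data/custom/images").glob("*.jpg"))
random.seed(0)
random.shuffle(images)
split = int(0.9 * len(images))

Path("data/custom/train.txt").write_text("\n".join(images[:split]) + "\n")
Path("data/custom/val.txt").write_text("\n".join(images[split:]) + "\n")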

Training

To train on the custom dataset run:

python train.py --config-file configs/yolov3-custom.yaml --data data/custom.yaml --epochs 100 

Credit

YOLOv4: Optimal Speed and Accuracy of Object Detection

Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao

Abstract
There are a huge number of features which are said to improve Convolutional Neural Network (CNN) accuracy. Practical testing of combinations of such features on large datasets, and theoretical justification of the result, is required. Some features operate on certain models exclusively and for certain problems exclusively, or only for small-scale datasets; while some features, such as batch-normalization and residual-connections, are applicable to the majority of models, tasks, and datasets. We assume that such universal features include Weighted-Residual-Connections (WRC), Cross-Stage-Partial-connections (CSP), Cross mini-Batch Normalization (CmBN), Self-adversarial-training (SAT) and Mish-activation. We use new features: WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, CmBN, DropBlock regularization, and CIoU loss, and combine some of them to achieve state-of-the-art results: 43.5% AP (65.7% AP50) for the MS COCO dataset at a realtime speed of ~65 FPS on Tesla V100. Source code is at https://github.com/AlexeyAB/darknet.

[Paper: https://arxiv.org/abs/2004.10934] [Project Webpage] [Authors' Implementation: https://github.com/AlexeyAB/darknet]

@article{yolov4,
  title={YOLOv4: Optimal Speed and Accuracy of Object Detection},
  author={Bochkovskiy, Alexey and Wang, Chien-Yao and Liao, Hong-Yuan Mark},
  journal={arXiv preprint arXiv:2004.10934},
  year={2020}
}

About

A PyTorch implementation of YOLOv4: good performance, easy to use, fast.

License: Apache License 2.0

