zjykzj / YOLOv1

You Only Look Once: Unified, Real-Time Object Detection

Home Page: https://arxiv.org/abs/1506.02640

«YOLOv1» reproduces the paper "You Only Look Once".

  • Trained on the VOC07+12 trainval dataset and tested on the VOC2007 Test dataset with an input size of 448x448. The results are as follows:

| Implementation | ARCH | VOC AP[IoU=0.50] |
|---|---|---|
| Original (darknet) | YOLOv1 | 63.4 |
| Original (darknet) | FastYOLOv1 | 52.7 |
| abeardear/pytorch-YOLO-v1 | ResNet_YOLOv1 | 66.5 |
| zjykzj/YOLOv1 (This) | YOLOv1(S=14) | 71.71 |
| zjykzj/YOLOv1 (This) | FastYOLOv1(S=14) | 60.38 |
| zjykzj/YOLOv1 (This) | YOLOv1 | 66.85 |
| zjykzj/YOLOv1 (This) | FastYOLOv1 | 52.89 |
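To make the S=14 rows easier to interpret, here is a minimal sketch of how the grid size changes the size of the prediction tensor. It assumes that S in the table is the grid size from the paper's notation, and it uses the paper's VOC settings (B=2 boxes per cell, C=20 classes); the function names are hypothetical and the configs in this repository may use different values.

```python
def yolov1_output_shape(S, B=2, C=20):
    """Shape of the YOLOv1 prediction tensor: S x S x (B*5 + C).

    Each grid cell predicts B boxes (x, y, w, h, confidence) and C class scores.
    B=2 and C=20 are the paper's Pascal VOC settings (assumed here).
    """
    return (S, S, B * 5 + C)


def num_predicted_boxes(S, B=2):
    """Total number of boxes predicted per image."""
    return S * S * B


for S in (7, 14):
    print(f"S={S}: output shape {yolov1_output_shape(S)}, "
          f"{num_predicted_boxes(S)} boxes per image")
```

Under these assumptions, doubling S from 7 to 14 quadruples the number of grid cells and predicted boxes, which is one plausible reason the S=14 variants handle small or crowded objects better.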

Latest News

  • [2023/07/07] v0.4.0. Add ultralytics/yolov5(485da42) transforms.
    • After this update, the results of zjykzj/YOLOv1 exceed the results reported in the paper for both YOLOv1 and FastYOLOv1.
  • [2023/06/26] v0.3.2. Refactor data module.
  • [2023/05/16] v0.3.1. Add IGNORE_THRESH in YOLOv1Loss and reset lambda_* based on the YOLOv1 paper.
    • In this version, the test results on the VOC dataset exceed those of the paper implementation.
  • [2023/05/16] v0.3.0. Expand receptive field and use F.cross_entropy for class loss.
  • [2023/05/14] v0.2.0. Update VOC dataset training results for YOLOv1 and FastYOLOv1.
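The loss-related entries above (IGNORE_THRESH, lambda_*, cross-entropy class loss) can be illustrated with a small pure-Python sketch. This is not the code of YOLOv1Loss: the lambda values are the ones given in the paper, while the IGNORE_THRESH value and the way it is applied (skipping the no-object penalty for predictions that already overlap a ground-truth box well, as later YOLO versions do) are assumptions, and the function names are hypothetical.

```python
import math

LAMBDA_COORD = 5.0   # weight for box coordinate loss (paper value)
LAMBDA_NOOBJ = 0.5   # weight for no-object confidence loss (paper value)
IGNORE_THRESH = 0.5  # hypothetical value; check the repository config


def noobj_conf_loss(pred_confs, best_ious):
    """No-object confidence loss over predictors not assigned to any object.

    pred_confs: predicted confidences of the unassigned predictors.
    best_ious:  for each predictor, its best IoU with any ground-truth box.
    Predictors whose best IoU exceeds IGNORE_THRESH are skipped (assumed behavior).
    """
    loss = 0.0
    for conf, iou in zip(pred_confs, best_ious):
        if iou > IGNORE_THRESH:
            continue  # overlaps an object well enough; do not push confidence to 0
        loss += LAMBDA_NOOBJ * (conf - 0.0) ** 2
    return loss


def class_cross_entropy(logits, target_index):
    """Cross-entropy class loss for one cell, computed from raw class logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]
```

The paper itself uses sum-squared error for the class term; the cross-entropy variant shown here corresponds to the v0.3.0 change.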

Background

YOLOv1 is the first model in the YOLO series and establishes the basic architecture of the YOLO object detection network. In this repository, I reimplement YOLOv1 to better understand the YOLO architecture.
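As a reminder of the core idea in the paper, the grid cell that contains an object's center is responsible for predicting it. The sketch below shows that assignment for one box, assuming normalized center/size coordinates as input; the function name and return layout are hypothetical and do not mirror this repository's data pipeline.

```python
def encode_box(x_center, y_center, width, height, S=7):
    """Assign a normalized box to its responsible grid cell.

    Inputs are in [0, 1] relative to the whole image.
    Returns (row, col, (x_offset, y_offset, width, height)), where the offsets
    are relative to the cell and width/height stay relative to the image,
    following the parameterization described in the paper.
    """
    # Clamp so that a center exactly on the right/bottom edge stays in the grid.
    col = min(int(x_center * S), S - 1)
    row = min(int(y_center * S), S - 1)
    x_offset = x_center * S - col
    y_offset = y_center * S - row
    return row, col, (x_offset, y_offset, width, height)


# A box centered in the image lands in the middle cell of a 7x7 grid.
print(encode_box(0.5, 0.5, 0.2, 0.4, S=7))
```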

Prepare Data

Pascal VOC

Use the voc2yolov5.py script to convert the dataset:

python voc2yolov5.py -s /home/zj/data/voc -d /home/zj/data/voc/voc2yolov5-train -l trainval-2007 trainval-2012
python voc2yolov5.py -s /home/zj/data/voc -d /home/zj/data/voc/voc2yolov5-val -l test-2007

Then softlink the folder where the dataset is located to the specified location:

ln -s /path/to/voc /path/to/YOLOv1/../datasets/voc
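For readers unfamiliar with the target format of the conversion step, the sketch below shows the usual mapping from a Pascal VOC box (pixel corner coordinates) to a YOLOv5-style label line (class index plus normalized center and size). It is an illustration of the label format only, not the code of voc2yolov5.py; the function names are hypothetical.

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a VOC box in pixels to normalized (x_center, y_center, w, h)."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def format_label(class_id, box):
    """Format one object as a YOLOv5-style label line: 'cls xc yc w h'."""
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in box)


box = voc_to_yolo(100, 50, 300, 250, img_w=500, img_h=500)
print(format_label(11, box))
```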

Installation

Requirements

See NVIDIA/apex

Container

Development environment (uses an NVIDIA Docker container):

docker run --gpus all -it --rm -v </path/to/YOLOv1>:/app/YOLOv1 -v </path/to/voc>:/app/datasets/voc nvcr.io/nvidia/pytorch:22.08-py3

Usage

Train

  • One GPU
CUDA_VISIBLE_DEVICES=0 python main_amp.py -c configs/yolov1_s14_voc.cfg --opt-level=O1 ../datasets/voc
  • Multi-GPUs
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 --master_port "36121" main_amp.py -c configs/yolov1_s14_voc.cfg --opt-level=O1 ../datasets/voc

Eval

python eval.py -c configs/yolov1_s14_voc.cfg -ckpt outputs/yolov1_s14_voc/model_best.pth.tar ../datasets/voc
VOC07 metric? Yes
AP for aeroplane = 0.7277
AP for bicycle = 0.8156
AP for bird = 0.7018
AP for boat = 0.5847
AP for bottle = 0.4280
AP for bus = 0.7849
AP for car = 0.7739
AP for cat = 0.8371
AP for chair = 0.5432
AP for cow = 0.7970
AP for diningtable = 0.7196
AP for dog = 0.8270
AP for horse = 0.8401
AP for motorbike = 0.7996
AP for person = 0.7258
AP for pottedplant = 0.4511
AP for sheep = 0.7157
AP for sofa = 0.7383
AP for train = 0.8082
AP for tvmonitor = 0.7221
Mean AP = 0.7171

python eval.py -c configs/yolov1_voc.cfg -ckpt outputs/yolov1_voc/model_best.pth.tar ../datasets/voc
VOC07 metric? Yes
AP for aeroplane = 0.6916
AP for bicycle = 0.7539
AP for bird = 0.6359
AP for boat = 0.5363
AP for bottle = 0.3216
AP for bus = 0.7710
AP for car = 0.7297
AP for cat = 0.8380
AP for chair = 0.4568
AP for cow = 0.7125
AP for diningtable = 0.6579
AP for dog = 0.7984
AP for horse = 0.7886
AP for motorbike = 0.7398
AP for person = 0.6630
AP for pottedplant = 0.4048
AP for sheep = 0.6586
AP for sofa = 0.6916
AP for train = 0.8208
AP for tvmonitor = 0.6996
Mean AP = 0.6685

python eval.py -c configs/fastyolov1_s14_voc.cfg -ckpt outputs/fastyolov1_s14_voc/model_best.pth.tar ../datasets/voc
VOC07 metric? Yes
AP for aeroplane = 0.6090
AP for bicycle = 0.7262
AP for bird = 0.5349
AP for boat = 0.4699
AP for bottle = 0.2417
AP for bus = 0.7292
AP for car = 0.7069
AP for cat = 0.7192
AP for chair = 0.3803
AP for cow = 0.6386
AP for diningtable = 0.6300
AP for dog = 0.7174
AP for horse = 0.7696
AP for motorbike = 0.7248
AP for person = 0.6621
AP for pottedplant = 0.3198
AP for sheep = 0.6093
AP for sofa = 0.5662
AP for train = 0.7128
AP for tvmonitor = 0.6071
Mean AP = 0.6038

python eval.py -c configs/fastyolov1_voc.cfg -ckpt outputs/fastyolov1_voc/model_best.pth.tar ../datasets/voc
VOC07 metric? Yes
AP for aeroplane = 0.5515
AP for bicycle = 0.6446
AP for bird = 0.4649
AP for boat = 0.3989
AP for bottle = 0.1817
AP for bus = 0.6707
AP for car = 0.6120
AP for cat = 0.6896
AP for chair = 0.2574
AP for cow = 0.5105
AP for diningtable = 0.5809
AP for dog = 0.6595
AP for horse = 0.7308
AP for motorbike = 0.6273
AP for person = 0.5519
AP for pottedplant = 0.2394
AP for sheep = 0.4869
AP for sofa = 0.5197
AP for train = 0.6974
AP for tvmonitor = 0.5022
Mean AP = 0.5289
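The "VOC07 metric? Yes" line in the logs above refers to the 11-point interpolated AP used by the VOC2007 challenge, and "Mean AP" is the unweighted mean of the per-class APs. The sketch below illustrates both; it is not the code of eval.py, and the function names are hypothetical. The per-class values are copied from the yolov1_s14_voc log above.

```python
def voc07_ap(recalls, precisions):
    """11-point interpolated AP (VOC2007 metric).

    For each recall threshold t in {0.0, 0.1, ..., 1.0}, take the maximum
    precision among points with recall >= t (0 if none), then average.
    """
    ap = 0.0
    for i in range(11):
        t = i / 10
        candidates = [p for r, p in zip(recalls, precisions) if r >= t]
        ap += (max(candidates) if candidates else 0.0) / 11
    return ap


def mean_ap(per_class_ap):
    """Unweighted mean of per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


# Per-class APs from the yolov1_s14_voc evaluation log.
yolov1_s14_aps = [
    0.7277, 0.8156, 0.7018, 0.5847, 0.4280, 0.7849, 0.7739, 0.8371, 0.5432, 0.7970,
    0.7196, 0.8270, 0.8401, 0.7996, 0.7258, 0.4511, 0.7157, 0.7383, 0.8082, 0.7221,
]
print(f"Mean AP = {mean_ap(yolov1_s14_aps):.4f}")
```

Averaging the rounded per-class values reproduces the reported Mean AP of 0.7171 for this run.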

Demo

python demo.py -ct 0.2 configs/yolov1_s14_voc.cfg outputs/yolov1_s14_voc/model_best.pth.tar --exp voc assets/voc2007-test/
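The -ct 0.2 flag in the demo command presumably sets a confidence threshold (an assumption; check demo.py for the exact meaning). Under that assumption, a typical detection post-processing step looks like the sketch below: drop low-confidence boxes, then apply greedy non-maximum suppression. The function names, the NMS threshold, and the class-agnostic NMS are illustrative choices, not this repository's implementation.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def postprocess(detections, conf_thresh=0.2, nms_thresh=0.5):
    """Filter (box, score) pairs by confidence, then run greedy NMS.

    conf_thresh mirrors the assumed meaning of -ct; nms_thresh is hypothetical.
    """
    candidates = sorted(
        (d for d in detections if d[1] >= conf_thresh),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap too much with a higher-scoring kept box.
        if all(iou(box, k[0]) <= nms_thresh for k in kept):
            kept.append((box, score))
    return kept
```

Real pipelines usually run NMS per class; the class-agnostic version is shown only to keep the sketch short.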

Maintainers

  • zhujian - Initial work - zjykzj

Contributing

Anyone's participation is welcome! Open an issue or submit PRs.

License

Apache License 2.0 © 2023 zjykzj
