YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-Time Object Detection

Python 3.8 | PyTorch 1.12.1 | Docs

This repository contains the official implementation of the following paper:

YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-Time Object Detection
Yuming Chen, Xinbin Yuan, Ruiqi Wu, Jiabao Wang, Qibin Hou, Ming-Ming Cheng
Under review

[Homepage (TBD)] [Paper] [Zhihu (TBD)] [AIWalker] [Poster (TBD)] [Video (TBD)]

✨ News

Future work can be found in todo.md.

  • Aug, 2023: Our code is publicly available!

🛠️ Dependencies and Installation

We provide a simple script, install.sh, for installation. See install.md for more details.

  1. Clone and enter the repo.

    git clone https://github.com/FishAndWasabi/YOLO-MS.git
    cd YOLO-MS
  2. Run install.sh.

    bash install.sh
  3. Activate your environment!

    conda activate YOLO-MS
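
After activating the environment, it can help to confirm that the interpreter and key packages match the versions listed at the top of this README (Python 3.8, PyTorch 1.12.1). The snippet below is a minimal sketch and not part of the repository: `report_environment` is a hypothetical helper name, and the default package list (`torch`, `mmyolo`) is an assumption about what install.sh sets up.

```python
import sys
from importlib import metadata


def report_environment(packages=("torch", "mmyolo")):
    """Return the Python version and the installed version of each package.

    Packages that are not installed map to None instead of raising.
    The default package names are assumptions, not taken from install.sh.
    """
    report = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, version in report_environment().items():
        print(f"{name}: {version or 'not installed'}")
```

A `None` entry suggests the environment was not activated or the installation did not complete.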

🤖 Training and Evaluation

  1. Training

    1.1 Single GPU

    python tools/train.py ${CONFIG_FILE} [optional arguments]

    1.2 Multi GPU

    CUDA_VISIBLE_DEVICES=x bash tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
  2. Evaluation

    python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE}
  3. Deployment

    docker build docker/GPU/ -t mmdeploy:inside --build-arg USE_SRC_INSIDE=true
    docker run --gpus all --name mmdeploy_yoloms -it mmdeploy:inside
    docker cp deploy.sh mmdeploy_yoloms:/root/workspace
    docker cp ${CONFIG_FILE} mmdeploy_yoloms:/root/workspace
    docker cp ${CHECKPOINT_FILE} mmdeploy_yoloms:/root/workspace
    sh deploy_model.sh ${DEPLOY_CONFIG_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE} ${WORK_DIR}
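
When launching many runs from a script, the training and evaluation commands above can be assembled programmatically. The following is a rough sketch under stated assumptions: `train_command` and `eval_command` are hypothetical helper names, `configs/my_config.py` is a placeholder path, and `tools/dist_train.sh` is assumed to be invoked through `bash` because it is a shell script. Only the script paths and argument order come from the commands listed above.

```python
import shlex


def train_command(config, gpu_num=1, gpu_ids=None, extra_args=()):
    """Build the training command and the environment overrides it needs.

    gpu_num == 1 uses tools/train.py; larger values use tools/dist_train.sh.
    gpu_ids (for example "0,1") is returned as a CUDA_VISIBLE_DEVICES override.
    """
    if gpu_num < 1:
        raise ValueError("gpu_num must be at least 1")
    if gpu_num == 1:
        cmd = ["python", "tools/train.py", config]
    else:
        cmd = ["bash", "tools/dist_train.sh", config, str(gpu_num)]
    cmd.extend(extra_args)
    env = {"CUDA_VISIBLE_DEVICES": gpu_ids} if gpu_ids else {}
    return cmd, env


def eval_command(config, checkpoint):
    """Build the evaluation command for a config and a checkpoint file."""
    return ["python", "tools/test.py", config, checkpoint]


if __name__ == "__main__":
    # Placeholder config path; substitute a real config from the repository.
    cmd, env = train_command("configs/my_config.py", gpu_num=2, gpu_ids="0,1")
    print(env, shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, env={**os.environ, **env}, check=True)` from the repository root.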

🏡 Model Zoo

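1. YOLO-MS

| Model | Resolution | Epoch | Params(M) | FLOPs(G) | $AP$ | $AP_s$ | $AP_m$ | $AP_l$ | Config | 🔗 |
|-------|------------|-------|-----------|----------|------|--------|--------|--------|--------|----|
| XS | 640 | 300 | 4.5 | 8.7 | 43.1 | 24.0 | 47.8 | 59.1 | [config] | [model] |
| XS* | 640 | 300 | 4.5 | 8.7 | 43.4 | 23.7 | 48.3 | 60.3 | [config] | [model] |
| S | 640 | 300 | 8.1 | 15.6 | 46.2 | 27.5 | 50.6 | 62.9 | [config] | [model] |
| S* | 640 | 300 | 8.1 | 15.6 | 46.2 | 26.9 | 50.5 | 63.0 | [config] | [model] |
| - | 640 | 300 | 22.0 | 40.1 | 50.8 | 33.2 | 54.8 | 66.4 | [config] | [model] |
| -* | 640 | 300 | 22.2 | 40.1 | 50.8 | 33.2 | 54.8 | 66.4 | [config] | [model] |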

\* denotes models with SE attention.
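
To choose among the YOLO-MS variants for a given compute budget, the table above can be filtered directly. This is an illustrative sketch only: `best_under_budget` is a hypothetical helper, the numbers are copied from the table, and ties in AP are resolved in favor of the earlier row.

```python
# (name, params in M, FLOPs in G, AP) copied from the YOLO-MS table above.
YOLO_MS_VARIANTS = [
    ("XS", 4.5, 8.7, 43.1),
    ("XS*", 4.5, 8.7, 43.4),
    ("S", 8.1, 15.6, 46.2),
    ("S*", 8.1, 15.6, 46.2),
    ("-", 22.0, 40.1, 50.8),
    ("-*", 22.2, 40.1, 50.8),
]


def best_under_budget(max_params_m=None, max_flops_g=None):
    """Return the highest-AP variant within the limits, or None if none fit."""
    candidates = [
        v for v in YOLO_MS_VARIANTS
        if (max_params_m is None or v[1] <= max_params_m)
        and (max_flops_g is None or v[2] <= max_flops_g)
    ]
    # max() keeps the first maximal element, so ties go to the earlier row.
    return max(candidates, key=lambda v: v[3], default=None)


if __name__ == "__main__":
    print(best_under_budget(max_flops_g=20.0))
```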

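2. YOLOv6

| Model | Resolution | Epoch | Params(M) | FLOPs(G) | $AP$ | $AP_s$ | $AP_m$ | $AP_l$ | Config | 🔗 |
|-------|------------|-------|-----------|----------|------|--------|--------|--------|--------|----|
| t | 640 | 400 | 9.7 | 12.4 | 41.0 | 21.2 | 45.7 | 57.7 | [config] | [model] |
| t-MS | 640 | 400 | 8.1 | 9.6 | 43.5 (+2.5) | 26.0 | 48.3 | 57.8 | [config] | [model] (TBD) |

3. YOLOv8

| Model | Resolution | Epoch | Params(M) | FLOPs(G) | $AP$ | $AP_s$ | $AP_m$ | $AP_l$ | Config | 🔗 |
|-------|------------|-------|-----------|----------|------|--------|--------|--------|--------|----|
| n | 640 | 500 | 2.9 | 4.4 | 37.2 | 18.9 | 40.5 | 52.5 | [config] | [model] |
| n-MS | 640 | 500 | 2.9 | 4.4 | 40.3 (+3.1) | 22.0 | 44.6 | 53.7 | [config] | [model] |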
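
The AP gains shown in parentheses are the difference between each MS variant and its baseline. A small check of that arithmetic, using only the AP values from the YOLOv6 and YOLOv8 tables above (`ap_gain` is a hypothetical helper name):

```python
# (baseline AP, MS-variant AP) copied from the YOLOv6 and YOLOv8 tables above.
AP_PAIRS = {
    "YOLOv6-t": (41.0, 43.5),
    "YOLOv8-n": (37.2, 40.3),
}


def ap_gain(baseline_ap, ms_ap):
    """Absolute AP improvement, rounded to one decimal like the tables."""
    return round(ms_ap - baseline_ap, 1)


if __name__ == "__main__":
    for name, (baseline_ap, ms_ap) in AP_PAIRS.items():
        print(f"{name}: +{ap_gain(baseline_ap, ms_ap)} AP")
```

The results, 2.5 and 3.1, agree with the (+2.5) and (+3.1) entries in the tables.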

🏗️ Supported Tasks

  • Object Detection
  • Instance Segmentation (TBD)
  • Rotated Object Detection (TBD)
  • Object Tracking (TBD)
  • Detection in Crowded Scene (TBD)
  • Small Object Detection (TBD)

📖 Citation

If you find our repo useful for your research, please cite us:

@misc{chen2023yoloms,
      title={YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection},
      author={Yuming Chen and Xinbin Yuan and Ruiqi Wu and Jiabao Wang and Qibin Hou and Ming-Ming Cheng},
      year={2023},
      eprint={2308.05480},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

This project is based on the open-source codebase MMYOLO.

@misc{mmyolo2022,
    title={{MMYOLO: OpenMMLab YOLO} series toolbox and benchmark},
    author={MMYOLO Contributors},
    howpublished = {\url{https://github.com/open-mmlab/mmyolo}},
    year={2022}
}

📜 License

Licensed under the Creative Commons Attribution-NonCommercial 4.0 International license, for non-commercial use only. Any commercial use requires formal permission in advance.

📮 Contact

For technical questions, please contact chenyuming[AT]mail.nankai.edu.cn. For commercial licensing, please contact cmm[AT]nankai.edu.cn and andrewhoux[AT]gmail.com.

🤝 Acknowledgement

This repo is modified from the open-source real-time object detection codebase MMYOLO. The README layout is adapted from LED and CrossKD.
