This repository contains the official implementation of the following paper:
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-Time Object Detection
Yuming Chen, Xinbin Yuan, Ruiqi Wu, Jiabao Wang, Qibin Hou, Ming-Ming Cheng
Under review
[Homepage (TBD)] [Paper] [Zhihu (TBD)] [AIWalker] [Poster (TBD)] [Video (TBD)]
- Table of Contents
- News
- Dependencies and Installation
- Training and Evaluation
- Model Zoo
- Supported Tasks
- Citation
- License
- Contact
- Acknowledgement
News
Future work can be found in todo.md.
- Aug, 2023: Our code is publicly available!
Dependencies and Installation
We provide a simple script, install.sh, for installation; refer to install.md for more details.
- Clone and enter the repo.
  git clone https://github.com/FishAndWasabi/YOLO-MS.git
  cd YOLO-MS
- Run install.sh.
  bash install.sh
- Activate your environment!
  conda activate YOLO-MS
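After activating the environment, a quick import check can confirm that the core dependencies are visible to Python. The sketch below assumes that install.sh sets up PyTorch plus the OpenMMLab packages that MMYOLO builds on (`torch`, `mmengine`, `mmcv`, `mmdet`, `mmyolo`). This package list is an assumption and is not taken from install.sh, so adjust it to match install.md.

```python
import importlib.util

# Assumed package list (not read from install.sh); edit to match install.md.
ASSUMED_PACKAGES = ("torch", "mmengine", "mmcv", "mmdet", "mmyolo")


def missing_packages(names=ASSUMED_PACKAGES):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing packages:", ", ".join(missing) if missing else "none")
```

An empty result only shows that the packages are importable; it does not verify version compatibility between them.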
Training and Evaluation
- Training
  1.1 Single GPU
  python tools/train.py ${CONFIG_FILE} [optional arguments]
  1.2 Multi GPU
  CUDA_VISIBLE_DEVICES=x bash tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
- Evaluation
  python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE}
- Deployment
  docker build docker/GPU/ -t mmdeploy:inside --build-arg USE_SRC_INSIDE=true
  docker run --gpus all --name mmdeploy_yoloms -it mmdeploy:inside
  docker cp deploy.sh mmdeploy_yoloms:/root/workspace
  docker cp ${CONFIG_FILE} mmdeploy_yoloms:/root/workspace
  docker cp ${CHECKPOINT_FILE} mmdeploy_yoloms:/root/workspace
  sh deploy_model.sh ${DEPLOY_CONFIG_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE} ${WORK_DIR}
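The training and evaluation commands above follow a fixed pattern, so they can be assembled programmatically when launching several runs. The following is a minimal sketch, not part of the repository: the helper names `build_train_command` and `build_test_command` are hypothetical, and it assumes the `dist_train.sh` launcher is invoked with `bash` and that `${GPU_NUM}` equals the number of visible GPU ids.

```python
import shlex


def build_train_command(config_file, gpu_ids=None, extra_args=()):
    """Build the single- or multi-GPU training command as a shell string.

    Hypothetical helper: with no gpu_ids it mirrors the single-GPU command,
    otherwise it mirrors the multi-GPU launcher command.
    """
    extra = list(extra_args)
    if not gpu_ids:
        return shlex.join(["python", "tools/train.py", config_file, *extra])
    visible = ",".join(str(i) for i in gpu_ids)
    # Assumption: GPU_NUM is the count of ids listed in CUDA_VISIBLE_DEVICES.
    argv = ["bash", "tools/dist_train.sh", config_file, str(len(gpu_ids)), *extra]
    return f"CUDA_VISIBLE_DEVICES={visible} " + shlex.join(argv)


def build_test_command(config_file, checkpoint_file):
    """Build the evaluation command as a shell string."""
    return shlex.join(["python", "tools/test.py", config_file, checkpoint_file])


if __name__ == "__main__":
    # "my_config.py" and "my_checkpoint.pth" are placeholder file names.
    print(build_train_command("my_config.py"))
    print(build_train_command("my_config.py", gpu_ids=[0, 1]))
    print(build_test_command("my_config.py", "my_checkpoint.pth"))
```

`shlex.join` quotes any argument containing spaces or shell metacharacters, which keeps the generated string safe to paste into a terminal.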
Model Zoo
- YOLOv5-MS
- YOLOX-MS
- YOLOv6-MS
- YOLOv7-MS
- PPYOLOE-MS
- YOLOv8-MS
- YOLO-MS (Based on RTMDet)
1. YOLO-MS
Model | Resolution | Epoch | Params(M) | FLOPs(G) | AP | AP_s | AP_m | AP_l | Config | Download |
---|---|---|---|---|---|---|---|---|---|---|
XS | 640 | 300 | 4.5 | 8.7 | 43.1 | 24.0 | 47.8 | 59.1 | [config] | [model] |
XS* | 640 | 300 | 4.5 | 8.7 | 43.4 | 23.7 | 48.3 | 60.3 | [config] | [model] |
S | 640 | 300 | 8.1 | 15.6 | 46.2 | 27.5 | 50.6 | 62.9 | [config] | [model] |
S* | 640 | 300 | 8.1 | 15.6 | 46.2 | 26.9 | 50.5 | 63.0 | [config] | [model] |
- | 640 | 300 | 22.0 | 40.1 | 50.8 | 33.2 | 54.8 | 66.4 | [config] | [model] |
-* | 640 | 300 | 22.2 | 40.1 | 50.8 | 33.2 | 54.8 | 66.4 | [config] | [model] |
* denotes the variant with SE attention.
2. YOLOv6
Model | Resolution | Epoch | Params(M) | FLOPs(G) | AP | AP_s | AP_m | AP_l | Config | Download |
---|---|---|---|---|---|---|---|---|---|---|
t | 640 | 400 | 9.7 | 12.4 | 41.0 | 21.2 | 45.7 | 57.7 | [config] | [model] |
t-MS | 640 | 400 | 8.1 | 9.6 | 43.5 (+2.5) | 26.0 | 48.3 | 57.8 | [config] | [model] (TBD) |
3. YOLOv8
Model | Resolution | Epoch | Params(M) | FLOPs(G) | AP | AP_s | AP_m | AP_l | Config | Download |
---|---|---|---|---|---|---|---|---|---|---|
n | 640 | 500 | 2.9 | 4.4 | 37.2 | 18.9 | 40.5 | 52.5 | [config] | [model] |
n-MS | 640 | 500 | 2.9 | 4.4 | 40.3 (+3.1) | 22.0 | 44.6 | 53.7 | [config] | [model] |
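The gains quoted in parentheses in the YOLOv6 and YOLOv8 tables can be reproduced directly from the table rows. The short sketch below copies the Params, FLOPs, and first accuracy column for each baseline and its -MS variant and prints the differences; the dictionary layout and the `deltas` helper are illustrative only.

```python
# Rows copied from the YOLOv6 and YOLOv8 tables:
# (Params in M, FLOPs in G, first accuracy column).
BASELINES = {"YOLOv6-t": (9.7, 12.4, 41.0), "YOLOv8-n": (2.9, 4.4, 37.2)}
MS_VARIANTS = {"YOLOv6-t": (8.1, 9.6, 43.5), "YOLOv8-n": (2.9, 4.4, 40.3)}


def deltas(baseline, variant):
    """Return per-column differences (variant - baseline), rounded to one decimal."""
    return tuple(round(v - b, 1) for b, v in zip(baseline, variant))


for name in BASELINES:
    d_params, d_flops, d_acc = deltas(BASELINES[name], MS_VARIANTS[name])
    print(
        f"{name} -> {name}-MS: accuracy {d_acc:+.1f}, "
        f"Params {d_params:+.1f} M, FLOPs {d_flops:+.1f} G"
    )
```

Rounding to one decimal matches the precision of the table and avoids floating-point noise in the subtraction.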
Supported Tasks
- Object Detection
- Instance Segmentation (TBD)
- Rotated Object Detection (TBD)
- Object Tracking (TBD)
- Detection in Crowded Scene (TBD)
- Small Object Detection (TBD)
Citation
If you find our repo useful for your research, please cite us:
@misc{chen2023yoloms,
title={YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection},
author={Yuming Chen and Xinbin Yuan and Ruiqi Wu and Jiabao Wang and Qibin Hou and Ming-Ming Cheng},
year={2023},
eprint={2308.05480},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
This project is based on the open-source codebase MMYOLO.
@misc{mmyolo2022,
title={{MMYOLO: OpenMMLab YOLO} series toolbox and benchmark},
author={MMYOLO Contributors},
howpublished = {\url{https://github.com/open-mmlab/mmyolo}},
year={2022}
}
License
Licensed under the Creative Commons Attribution-NonCommercial 4.0 International license, for non-commercial use only. Any commercial use requires formal permission first.
Contact
For technical questions, please contact chenyuming[AT]mail.nankai.edu.cn.
For commercial licensing, please contact cmm[AT]nankai.edu.cn and andrewhoux[AT]gmail.com.
Acknowledgement
This repo is modified from the open-source real-time object detection codebase MMYOLO. The README file is adapted from those of LED and CrossKD.