RotationDetection

This is a TensorFlow-based rotation detection benchmark, also called UranusDet.

UranusDet

License: Apache License 2.0

Abstract

This is a TensorFlow-based rotation detection benchmark, also called UranusDet.
UranusDet is maintained by Xue Yang (Shanghai Jiao Tong University) under the supervision of Prof. Junchi Yan.

Papers and code related to remote sensing/aerial image detection: DOTA-DOAI.

Techniques:

The above-mentioned rotation detectors are all built on top of the following horizontal detectors:

Latest Performance

More results and trained models are available in MODEL_ZOO.md.

DOTA (Task1)

Base setting:

| Backbone | Neck | Training/test dataset | Data Augmentation | Epoch |
|----------|------|-----------------------|-------------------|-------|
| ResNet50_v1d 600->800 | FPN | trainval/test | × | 13 (AP50) or 17 (AP50:95) is enough for baseline (default is 13) |

| Method | Baseline | DOTA1.0 | Model | DOTA1.5 | Model | DOTA2.0 | Model | Anchor | Angle Pred. | Reg. Loss | Angle Range | Configs |
|--------|----------|---------|-------|---------|-------|---------|-------|--------|-------------|-----------|-------------|---------|
| - | RetinaNet-R | 67.25 | Baidu Drive (v7pe) | 56.50 | Baidu Drive (9oea) | 42.04 | Baidu Drive (3dud) | R | Reg. (∆θ) | smooth L1 | [-90,0) | dota1.0, dota1.5, dota2.0 |
| - | RetinaNet-H | 64.17 | Baidu Drive (j5l0) | 56.10 | Baidu Drive (70lo) | 43.06 | Baidu Drive (5kb2) | H | Reg. (∆θ) | smooth L1 | [-90,90) | dota1.0, dota1.5, dota2.0 |
| - | RetinaNet-H | 65.17 | Baidu Drive (b3f5) | 58.25 | Baidu Drive (4u6d) | 44.05 | Baidu Drive (5pn3) | H | Reg. (sin θ, cos θ) | smooth L1 | [-90,90) | dota1.0, dota1.5, dota2.0 |
| - | RetinaNet-H | 65.73 | Baidu Drive (jum2) | 58.87 | Baidu Drive (lld0) | 44.16 | Baidu Drive (ffmo) | H | Reg. (∆θ) | smooth L1 | [-90,0) | dota1.0, dota1.5, dota2.0 |
| IoU-Smooth L1 | RetinaNet-H | 66.99 | Baidu Drive (bc83) | 59.17 | Baidu Drive (3r1n) | 46.31 | Baidu Drive (njpc) | H | Reg. (∆θ) | iou-smooth L1 | [-90,0) | dota1.0, dota1.5, dota2.0 |
| RIDet | RetinaNet-H | 66.06 | Baidu Drive (0u9r) | 58.91 | Baidu Drive (sokc) | 45.35 | Baidu Drive (k8gq) | H | Quad. | hungarian loss | - | dota1.0, dota1.5, dota2.0 |
| RSDet | RetinaNet-H | 67.27 | Baidu Drive (6nt5) | 61.42 | Baidu Drive (vpmm) | 46.71 | Baidu Drive (p48g) | H | Quad. | modulated loss | - | dota1.0, dota1.5, dota2.0 |
| CSL | RetinaNet-H | 67.38 | Baidu Drive (g3wt) | 58.55 | Baidu Drive (6emh) | 43.34 | Baidu Drive (l5cw) | H | Cls.: Gaussian (r=1, w=10) | smooth L1 | [-90,90) | dota1.0, dota1.5, dota2.0 |
| DCL | RetinaNet-H | 67.39 | Baidu Drive (p9tu) | 59.38 | Baidu Drive (wb1q) | 45.46 | Baidu Drive (qjjs) | H | Cls.: BCL (w=180/256) | smooth L1 | [-90,90) | dota1.0, dota1.5, dota2.0 |
| - | FCOS | 67.69 | Baidu Drive (r4w6) | 61.05 | Baidu Drive (3rhu) | 48.10 | Baidu Drive (bi12) | - | Quad. | smooth L1 | - | dota1.0, dota1.5, dota2.0 |
| RSDet | FCOS | 67.91 | Baidu Drive (tk89) | 62.18 | Baidu Drive (mjf9) | 48.81 | Baidu Drive (snul) | - | Quad. | modulated loss | - | dota1.0, dota1.5, dota2.0 |
| GWD | RetinaNet-H | 68.93 | Baidu Drive (0w51) | 60.03 | Baidu Drive (4p2r) | 46.65 | Baidu Drive (9kwq) | H | Reg. (∆θ) | gwd | [-90,0) | dota1.0, dota1.5, dota2.0 |
| GWD + SWA | RetinaNet-H | 69.92 | Baidu Drive (qi7o) | 60.60 | Baidu Drive (ah0b) | 47.63 | Baidu Drive (u0gu) | H | Reg. (∆θ) | gwd | [-90,0) | dota1.0, dota1.5, dota2.0 |
| KLD | RetinaNet-H | 71.28 | Baidu Drive (ob8f) | 62.50 | Baidu Drive (hpsr) | 47.69 | Baidu Drive (wh45) | H | Reg. (∆θ) | kld | [-90,0) | dota1.0, dota1.5, dota2.0 |
| R3Det | RetinaNet-H | 70.66 | Baidu Drive (30lt) | 62.91 | Baidu Drive (rdc6) | 48.43 | Baidu Drive (k9bo) | H->R | Reg. (∆θ) | smooth L1 | [-90,0) | dota1.0, dota1.5, dota2.0 |
| DCL | R3Det | 71.21 | Baidu Drive (jueq) | 61.98 | Baidu Drive (cjqm) | 48.71 | Baidu Drive (jb5s) | H->R | Cls.: BCL (w=180/256) | iou-smooth L1 | [-90,0)->[-90,90) | dota1.0, dota1.5, dota2.0 |
| GWD | R3Det | 71.56 | Baidu Drive (8962) | 63.22 | Baidu Drive (c3jk) | 49.25 | Baidu Drive (o3dw) | H->R | Reg. (∆θ) | smooth L1->gwd | [-90,0) | dota1.0, dota1.5, dota2.0 |
| KLD | R3Det | 71.73 | Baidu Drive (go1m) | 65.18 | Baidu Drive (qwwa) | 50.90 | Baidu Drive (hc49) | H->R | Reg. (∆θ) | kld | [-90,0) | dota1.0, dota1.5, dota2.0 |
| - | R2CNN (Faster-RCNN) | 72.27 | Baidu Drive (7o5p) | 66.45 | Baidu Drive (ho88) | 52.35 | Baidu Drive (suwu) | H->R | Reg. (∆θ) | smooth L1 | [-90,0) | dota1.0, dota1.5, dota2.0 |

Note:

  • Single GPU training: SAVE_WEIGHTS_INTE = iter_epoch * 1 (DOTA1.0: iter_epoch=27000, DOTA1.5: iter_epoch=32000, DOTA2.0: iter_epoch=40000)
  • Multi-GPU training (better): SAVE_WEIGHTS_INTE = iter_epoch * 2
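
For orientation, here is a minimal sketch of how this rule might appear in the active cfgs.py; only SAVE_WEIGHTS_INTE and the per-dataset iter_epoch values come from the note above, the other names are illustrative:

```python
# Hypothetical cfgs.py excerpt: the checkpoint-saving interval follows the rule above.
ITER_EPOCH = 27000                     # DOTA1.0; use 32000 for DOTA1.5, 40000 for DOTA2.0
SAVE_WEIGHTS_INTE = ITER_EPOCH * 1     # single-GPU training; use ITER_EPOCH * 2 for multi-GPU
```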

My Development Environment

docker images: yangxue2docker/yx-tf-det:tensorflow1.13.1-cuda10-gpu-py3 or yangxue2docker/py3-tf1.15.2-nv-torch1.8.0-cuda11:v1.0

  1. python3.5 (anaconda recommended)
  2. cuda 10.0
  3. opencv-python 4.1.1.26
  4. tfplot 0.2.0 (optional)
  5. tensorflow-gpu 1.13
  6. tqdm 4.54.0
  7. Shapely 1.7.1

Note: For 30xx-series graphics cards, I recommend this blog for installing tf1.xx, or refer to ngc and tensorflow-release-notes to download a docker image that matches your environment, or just use my docker image (yangxue2docker/py3-tf1.15.2-nv-torch1.8.0-cuda11:v1.0).

Download Model

Pretrain weights

Download the pretrained weights you need from one of the following three options, then put them in $PATH_ROOT/dataloader/pretrained_weights.

  1. MxNet pretrained weights (recommended in this repo; the default in NET_NAME): resnet_v1d, resnet_v1b, refer to gluon2TF.
  2. TensorFlow pretrained weights: resnet50_v1, resnet101_v1, resnet152_v1, efficientnet, mobilenet_v2, darknet53 (Baidu Drive (1jg2), Google Drive).
  3. PyTorch pretrained weights, refer to pretrain_zoo.py and Others.
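
For orientation, a minimal sketch of where this choice is made (the file contents below are an assumption; only NET_NAME, the weight names, and the directory come from the list above):

```python
# Hypothetical excerpt of $PATH_ROOT/libs/configs/cfgs.py: NET_NAME selects the backbone,
# and the matching pretrained checkpoint is expected under
# $PATH_ROOT/dataloader/pretrained_weights/.
NET_NAME = 'resnet50_v1d'   # MxNet-converted ResNet50_v1d weights (option 1, recommended)
```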

Trained weights

  1. Please download the trained models provided by this project, then put them in $PATH_ROOT/output/pretained_weights.

Compile

```  
cd $PATH_ROOT/libs/utils/cython_utils
rm *.so
rm *.c
rm *.cpp
python setup.py build_ext --inplace    # or: make

cd $PATH_ROOT/libs/utils/
rm *.so
rm *.c
rm *.cpp
python setup.py build_ext --inplace
```

Train

  1. If you want to train your own dataset, please note:

    (1) Select the detector and dataset you want to use, and mark them as #DETECTOR and #DATASET (such as #DETECTOR=retinanet and #DATASET=DOTA)
    (2) Modify parameters (such as CLASS_NUM, DATASET_NAME, VERSION, etc.) in $PATH_ROOT/libs/configs/#DATASET/#DETECTOR/cfgs_xxx.py
    (3) Copy $PATH_ROOT/libs/configs/#DATASET/#DETECTOR/cfgs_xxx.py to $PATH_ROOT/libs/configs/cfgs.py
    (4) Add category information in $PATH_ROOT/libs/label_name_dict/label_dict.py (see the sketch after this list)
    (5) Add data_name to $PATH_ROOT/dataloader/dataset/read_tfrecord.py  
    
  2. Make tfrecord
    If the images are very large (as in the DOTA dataset), they need to be cropped. Take the DOTA dataset as an example:

    cd $PATH_ROOT/dataloader/dataset/DOTA
    python data_crop.py
    

    If the images do not need to be cropped, just convert the annotation files into xml format; refer to example.xml.

    cd $PATH_ROOT/dataloader/dataset/  
    python convert_data_to_tfrecord.py --root_dir='/PATH/TO/DOTA/' 
                                       --xml_dir='labeltxt'
                                       --image_dir='images'
                                       --save_name='train' 
                                       --img_format='.png' 
                                       --dataset='DOTA'
    
  3. Start training

    cd $PATH_ROOT/tools/#DETECTOR
    python train.py
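
As a concrete illustration of step (4) in the list above, the following is a minimal sketch of adding a new dataset's categories; the dictionary names and the example classes are assumptions for illustration, not the repository's exact layout:

```python
# Hypothetical excerpt of $PATH_ROOT/libs/label_name_dict/label_dict.py:
# each dataset maps its class names to integer labels, keeping 0 for the background class.
NAME_LABEL_MAP = {
    'back_ground': 0,
    'plane': 1,
    'ship': 2,
}
LABEL_NAME_MAP = {v: k for k, v in NAME_LABEL_MAP.items()}  # reverse lookup for visualization
```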
    

Test

  1. For large-scale images, take the DOTA dataset as an example (the output files and visualizations are saved in $PATH_ROOT/tools/#DETECTOR/test_dota/VERSION):

    cd $PATH_ROOT/tools/#DETECTOR
    python test_dota.py --test_dir='/PATH/TO/IMAGES/'  
                        --gpus=0,1,2,3,4,5,6,7  
                        -ms (multi-scale testing, optional)
                        -s (visualization, optional)
    
    or (better than multi-scale testing)
    
    python test_dota_sota.py --test_dir='/PATH/TO/IMAGES/'  
                             --gpus=0,1,2,3,4,5,6,7  
                             -s (visualization, optional)
    

    Notice: To make resuming from a breakpoint convenient, the result file is opened in 'a+' (append) mode. If a model with the same #VERSION needs to be tested again, the original test results must be deleted first (see the sketch at the end of this section).

  2. For small-scale images, take the HRSC2016 dataset as an example:

    cd $PATH_ROOT/tools/#DETECTOR
    python test_hrsc2016.py --test_dir='/PATH/TO/IMAGES/'  
                            --gpu=0
                            --image_ext='bmp'
                            --test_annotation_path='/PATH/TO/ANNOTATIONS'
                            -s (visualization, optional)
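
A minimal illustration of the append-mode behaviour described in the notice of step 1 (file names here are hypothetical): because results are written with mode 'a+', re-running the same #VERSION adds new detections after the old ones instead of replacing them, so the old results under test_dota/VERSION must be removed first.

```python
# Illustration only: 'a+' appends, it does not overwrite.
with open('det_results.txt', 'a+') as f:   # first test run
    f.write('P0001.png plane 0.98 ...\n')
with open('det_results.txt', 'a+') as f:   # re-run without deleting the old results
    f.write('P0001.png plane 0.97 ...\n')  # -> both lines are now present (duplicated results)
```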
    

Tensorboard

cd $PATH_ROOT/output/summary
tensorboard --logdir=.

Citation

If you find our code useful for your research, please consider citing:

@article{yang2021learning,
    title={Learning High-Precision Bounding Box for Rotated Object Detection via Kullback-Leibler Divergence},
    author={Yang, Xue and Yang, Xiaojiang and Yang, Jirui and Ming, Qi and Wang, Wentao and Tian, Qi and Yan, Junchi},
    journal={arXiv preprint arXiv:2106.01883},
    year={2021}
}

@inproceedings{yang2021rethinking,
    title={Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss},
    author={Yang, Xue and Yan, Junchi and Ming, Qi and Wang, Wentao and Zhang, Xiaopeng and Tian, Qi},
    booktitle={International Conference on Machine Learning (ICML)},
    year={2021}
}

@article{ming2021optimization,
    title={Optimization for Oriented Object Detection via Representation Invariance Loss},
    author={Ming, Qi and Zhou, Zhiqiang and Miao, Lingjuan and Yang, Xue and Dong, Yunpeng},
    journal={arXiv preprint arXiv:2103.11636},
    year={2021}
}

@inproceedings{yang2021dense,
    title={Dense Label Encoding for Boundary Discontinuity Free Rotation Detection},
    author={Yang, Xue and Hou, Liping and Zhou, Yue and Wang, Wentao and Yan, Junchi},
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    month={June},
    year={2021},
    pages={15819-15829}
}

@inproceedings{yang2020arbitrary,
    title={Arbitrary-oriented object detection with circular smooth label},
    author={Yang, Xue and Yan, Junchi},
    booktitle={European Conference on Computer Vision (ECCV)},
    pages={677--694},
    year={2020},
    organization={Springer}
}

@inproceedings{yang2021r3det,
    title={R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object},
    author={Yang, Xue and Yan, Junchi and Feng, Ziming and He, Tao},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence (AAAI)},
    year={2021}
}

@inproceedings{qian2021learning,
    title={Learning modulated loss for rotated object detection},
    author={Qian, Wen and Yang, Xue and Peng, Silong and Yan, Junchi and Guo, Yue},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence (AAAI)},
    year={2021}
}

@article{yang2020scrdet++,
    title={SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing},
    author={Yang, Xue and Yan, Junchi and Yang, Xiaokang and Tang, Jin and Liao, Wenglong and He, Tao},
    journal={arXiv preprint arXiv:2004.13316},
    year={2020}
}

@inproceedings{yang2019scrdet,
    title={SCRDet: Towards more robust detection for small, cluttered and rotated objects},
    author={Yang, Xue and Yang, Jirui and Yan, Junchi and Zhang, Yue and Zhang, Tengfei and Guo, Zhi and Sun, Xian and Fu, Kun},
    booktitle={Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
    pages={8232--8241},
    year={2019}
}

Reference

1. https://github.com/endernewton/tf-faster-rcnn
2. https://github.com/zengarden/light_head_rcnn
3. https://github.com/tensorflow/models/tree/master/research/object_detection
4. https://github.com/fizyr/keras-retinanet
