awesome-computer-vision-models
A list of popular classification, segmentation, and detection models, with their corresponding evaluation metrics.
Classification models
| Model | Number of parameters | Top-1 Error (%) | Top-5 Error (%) |
|---|---|---|---|
| AlexNet | 61.1M | 44.12 | 21.26 |
| VGG-16 | 138.3M | 26.78 | 8.69 |
| ResNet-50 | 25.5M | 23.50 | 6.87 |
| Inception v3 | 23.8M | 21.2 | 5.6 |
| PreResNet-50 | 25.5M | 23.39 | 6.68 |
| DenseNet-121 | 7.9M | 25.0 | 7.71 |
| PyramidNet-200(a=300) | 62.1M | 19.5 | 4.8 |
| PyramidNet-200(a=450) | 116.4M | 19.2 | 4.7 |
| ResNeXt-101 | 83.5M | 20.4 | 5.3 |
| WRN-50-2-bottleneck | 68.9M | 21.9 | 6.03 |
| Xception | ? | 21.0 | 5.5 |
| Inception-ResNet-v2 | 55.9M | 19.9 | 4.9 |
| Inception-v4 | 42.6M | 20.0 | 5.0 |
| Very Deep PolyNet | ? | 18.71 | 4.25 |
| DarkNet Ref | 7.3M | 38.09 | 16.71 |
| Attention-92 | 51.3M | 19.5 | 4.8 |
| CondenseNet (G=C=8) | 4.8M | 26.2 | 8.3 |
| DRN-A-50 | 25.6M | 22.94 | 6.57 |
| DPN-131 | 79.3M | 18.55 | 4.16 |
| ShuffleNet 2×(g=3) | ? | 26.3 | ? |
| DiracNet-34 | 21.8M | 27.79 | 9.34 |
| SENet-154 | 115.2M | 18.84 | 4.65 |
| MobileNet | 4.2M | 29.4 | 10.5 |
| NASNet-A | 5.3M | 26.0 | 8.7 |
| AirNet50-1x64d (r=2) | 27.43M | 22.48 | 6.21 |
| BAM-ResNet-50 | 25.92M | 23.68 | 6.96 |
| CBAM-ResNet-50 | 28.1M | 23.02 | 6.38 |
| SqueezeResNet | 1.23M | 39.83 | 17.84 |
| 2.0-SqNxt-23v5 | 3.2M | 32.56 | 11.8 |
| ShuffleNet v2 2x SE | 7.6M | 24.6 | ? |
| 456-MENet-24×1(g=3) | 5.3M | 28.4 | 9.8 |
| FD-MobileNet 1x | 2.9M | 34.7 | ? |
| MobileNetV2 | 3.4M | 28.0 | ? |
| IGCV3 | 3.5M | 28.22 | 9.54 |
| DARTS | 4.9M | 26.9 | 9.0 |
| PNASNet-5 | 5.1M | 25.8 | 8.1 |
| AmoebaNet-C | 5.1M | 24.3 | 7.6 |
| MnasNet-92 (+SE) | 5.1M | 23.87 | 7.15 |
| IBN-Net50-a | ? | 22.54 | 6.32 |
| MarginNet | ? | 22.0 | ? |
| A^2 Net | ? | 23.0 | 6.5 |
| FishNeXt-150 | 26.2M | 21.5 | ? |
| Shape-ResNet | 25.5M | 23.28 | 6.72 |
| ResNet-50-Bin-5 | ? | 23.0 | ? |
| SimCNN(k=3 train) | ? | 28.4 | 10.2 |
| SKNet-50 | 27.5M | 20.79 | ? |
| SRM-ResNet-50 | 25.62M | 22.87 | 6.49 |
| EfficientNet-B7 | 66M | 15.6 | 2.9 |
| ProxylessNAS | ? | 24.9 | 7.5 |
| MixNet-L | 7.3M | 21.1 | 5.8 |
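For readers unfamiliar with the Top-1 and Top-5 error columns, the following is a minimal sketch of how such an error rate can be computed, assuming the values are percentages of misclassified samples. The function name `top_k_error` and the toy scores are illustrative only and are not taken from any of the listed papers or repositories.

```python
def top_k_error(scores, labels, k=1):
    """Percentage of samples whose true label is NOT among the k highest-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return 100.0 * misses / len(labels)


# Hypothetical class scores for 4 samples over 3 classes.
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
labels = [1, 1, 2, 2]

print(top_k_error(scores, labels, k=1))  # 50.0
print(top_k_error(scores, labels, k=2))  # 25.0
```

Top-5 error is the same computation with `k=5`, so it can never exceed Top-1 error for the same model.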
Segmentation models
Semantic segmentation
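| Model | PASCAL-Context | Cityscapes (mIoU) | PASCAL VOC 2012 (mIoU) | COCO Stuff | ADE20K val (mIoU) |
|---|---|---|---|---|---|
| U-Net | ? | ? | ? | ? | ? |
| DeconvNet | ? | ? | 72.5 | ? | ? |
| ParseNet | 40.4 | ? | 69.8 | ? | ? |
| Piecewise | 43.3 | 71.6 | 78.0 | ? | ? |
| SegNet | ? | 56.1 | ? | ? | ? |
| FCN | 37.8 | 65.3 | 62.2 | 22.7 | 29.39 |
| ENet | ? | 58.3 | ? | ? | ? |
| DilatedNet | ? | ? | 67.6 | ? | 32.31 |
| PixelNet | ? | ? | 69.8 | ? | ? |
| RefineNet | 47.3 | 73.6 | 83.4 | 33.6 | 40.70 |
| LRR | ? | 71.8 | 79.3 | ? | ? |
| FRRN | ? | 71.8 | ? | ? | ? |
| MultiNet | ? | ? | ? | ? | ? |
| DeepLab | 45.7 | 64.8 | 79.7 | ? | ? |
| LinkNet | ? | ? | ? | ? | ? |
| Tiramisu | ? | ? | ? | ? | ? |
| ICNet | ? | 70.6 | ? | ? | ? |
| ERFNet | ? | 68.0 | ? | ? | ? |
| PSPNet | 47.8 | 80.2 | 85.4 | ? | 44.94 |
| GCN | ? | 76.9 | 82.2 | ? | ? |
| Segaware | ? | ? | 69.0 | ? | ? |
| PixelDCN | ? | ? | 73.0 | ? | ? |
| DeepLabv3 | ? | ? | 85.7 | ? | ? |
| DUC, HDC | ? | 77.1 | ? | ? | ? |
| ShuffleSeg | ? | 59.3 | ? | ? | ? |
| AdaptSegNet | ? | 46.7 | ? | ? | ? |
| TuSimple-DUC | 80.1 | ? | 83.1 | ? | ? |
| R2U-Net | ? | ? | ? | ? | ? |
| Attention U-Net | ? | ? | ? | ? | ? |
| DANet | 52.6 | 81.5 | ? | 39.7 | ? |
| ENCNet | 51.7 | 75.8 | 85.9 | ? | 44.65 |
| ShelfNet | 48.4 | 75.8 | 84.2 | ? | ? |
| LadderNet | ? | ? | ? | ? | ? |
| CCC-ERFnet | ? | 69.01 | ? | ? | ? |
| DifNet-101 | 45.1 | ? | 73.2 | ? | ? |
| BiSeNet(Res18) | ? | ? | 74.7 | 28.1 | ? |
| ESPNet | ? | ? | 63.01 | ? | ? |
| SPADE | ? | 62.3 | ? | 37.4 | 38.5 |
| SeamlessSeg | ? | 77.5 | ? | ? | ? |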
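The mIoU (mean intersection over union) figures above average a per-class overlap score between predicted and ground-truth label maps. The sketch below shows the idea on flattened label lists; `mean_iou` and the toy labels are hypothetical, and real benchmarks typically accumulate a confusion matrix over the whole dataset and may handle ignored labels differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU (in percent) over classes present in the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both maps, and pixels labelled c in either map.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    if not ious:
        return 0.0
    return 100.0 * sum(ious) / len(ious)


# Hypothetical flattened label maps with 3 classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]

# Per-class IoU is 1/3, 2/3 and 1/2, so the mean is 50%.
print(mean_iou(pred, target, num_classes=3))
```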
Detection models
[R-CNN] Rich feature hierarchies for accurate object detection and semantic segmentation | Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik | [CVPR' 14] |[pdf]
[official code - caffe]
[2014]
[OverFeat] OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks | Pierre Sermanet, et al. | [ICLR' 14] |[pdf]
[official code - torch]
[2014]
[MultiBox] Scalable Object Detection using Deep Neural Networks | Dumitru Erhan, et al. | [CVPR' 14] |[pdf]
[2014]
[SPP-Net] Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition | Kaiming He, et al. | [ECCV' 14] |[pdf]
[official code - caffe]
[unofficial code - keras]
[unofficial code - tensorflow]
[2014]
[MR-CNN] Object detection via a multi-region & semantic segmentation-aware CNN model | Spyros Gidaris, Nikos Komodakis | [ICCV' 15] |[pdf]
[official code - caffe]
[2015]
[DeepBox] DeepBox: Learning Objectness with Convolutional Networks | Weicheng Kuo, Bharath Hariharan, Jitendra Malik | [ICCV' 15] |[pdf]
[official code - caffe]
[2015]
[AttentionNet] AttentionNet: Aggregating Weak Directions for Accurate Object Detection | Donggeun Yoo, et al. | [ICCV' 15] |[pdf]
[2015]
[Fast R-CNN] Fast R-CNN | Ross Girshick | [ICCV' 15] |[pdf]
[official code - caffe]
[2015]
[DeepProposal] DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers | Amir Ghodrati, et al. | [ICCV' 15] |[pdf]
[official code - matconvnet]
[2015]
[Faster R-CNN, RPN] Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | Shaoqing Ren, et al. | [NIPS' 15] |[pdf]
[official code - caffe]
[unofficial code - tensorflow]
[unofficial code - pytorch]
[YOLO v1] You Only Look Once: Unified, Real-Time Object Detection | Joseph Redmon, et al. | [CVPR' 16] |[pdf]
[official code - c]
[2016]
[G-CNN] G-CNN: an Iterative Grid Based Object Detector | Mahyar Najibi, et al. | [CVPR' 16] |[pdf]
[2016]
[AZNet] Adaptive Object Detection Using Adjacency and Zoom Prediction | Yongxi Lu, Tara Javidi | [CVPR' 16] |[pdf]
[2016]
[ION] Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks | Sean Bell, et al. | [CVPR' 16] |[pdf]
[2016]
[HyperNet] HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection | Tao Kong, et al. | [CVPR' 16] |[pdf]
[2016]
[OHEM] Training Region-based Object Detectors with Online Hard Example Mining | Abhinav Shrivastava, et al. | [CVPR' 16] |[pdf]
[official code - caffe]
[2016]
[CRAFT] CRAFT Objects from Images | Bin Yang, et al. | [CVPR' 16] |[pdf]
[official code - caffe]
[2016]
[MPN] A MultiPath Network for Object Detection | Sergey Zagoruyko, et al. | [BMVC' 16] |[pdf]
[official code - torch]
[2016]
[SSD] SSD: Single Shot MultiBox Detector | Wei Liu, et al. | [ECCV' 16] |[pdf]
[official code - caffe]
[unofficial code - tensorflow]
[unofficial code - pytorch]
[2016]
[GBDNet] Crafting GBD-Net for Object Detection | Xingyu Zeng, et al. | [ECCV' 16] |[pdf]
[official code - caffe]
[2016]
[CPF] Contextual Priming and Feedback for Faster R-CNN | Abhinav Shrivastava and Abhinav Gupta | [ECCV' 16] |[pdf]
[2016]
[MS-CNN] A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection | Zhaowei Cai, et al. | [ECCV' 16] |[pdf]
[official code - caffe]
[2016]
[R-FCN] R-FCN: Object Detection via Region-based Fully Convolutional Networks | Jifeng Dai, et al. | [NIPS' 16] |[pdf]
[official code - caffe]
[unofficial code - caffe]
[2016]
[PVANET] PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection | Kye-Hyeon Kim, et al. | [NIPSW' 16] |[pdf]
[official code - caffe]
[2016]
[DeepID-Net] DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection | Wanli Ouyang, et al. | [TPAMI' 16] |[pdf]
[2016]
[NoC] Object Detection Networks on Convolutional Feature Maps | Shaoqing Ren, et al. | [TPAMI' 16] |[pdf]
[DSSD] DSSD: Deconvolutional Single Shot Detector | Cheng-Yang Fu, et al. | [arXiv' 17] |[pdf]
[official code - caffe]
[2017]
[TDM] Beyond Skip Connections: Top-Down Modulation for Object Detection | Abhinav Shrivastava, et al. | [CVPR' 17] |[pdf]
[2017]
[FPN] Feature Pyramid Networks for Object Detection | Tsung-Yi Lin, et al. | [CVPR' 17] |[pdf]
[unofficial code - caffe]
[2017]
[YOLO v2] YOLO9000: Better, Faster, Stronger | Joseph Redmon, Ali Farhadi | [CVPR' 17] |[pdf]
[official code - c]
[unofficial code - caffe]
[unofficial code - tensorflow]
[unofficial code - tensorflow]
[unofficial code - pytorch]
[2017]
[RON] RON: Reverse Connection with Objectness Prior Networks for Object Detection | Tao Kong, et al. | [CVPR' 17] |[pdf]
[official code - caffe]
[unofficial code - tensorflow]
[2017]
[DCN] Deformable Convolutional Networks | Jifeng Dai, et al. | [ICCV' 17] |[pdf]
[official code - mxnet]
[unofficial code - tensorflow]
[unofficial code - pytorch]
[2017]
[DeNet] DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling | Lachlan Tychsen-Smith, Lars Petersson | [ICCV' 17] |[pdf]
[official code - theano]
[2017]
[CoupleNet] CoupleNet: Coupling Global Structure with Local Parts for Object Detection | Yousong Zhu, et al. | [ICCV' 17] |[pdf]
[official code - caffe]
[2017]
[RetinaNet] Focal Loss for Dense Object Detection | Tsung-Yi Lin, et al. | [ICCV' 17] |[pdf]
[official code - keras]
[unofficial code - pytorch]
[unofficial code - mxnet]
[unofficial code - tensorflow]
[2017]
[Mask R-CNN] Mask R-CNN | Kaiming He, et al. | [ICCV' 17] |[pdf]
[official code - caffe2]
[unofficial code - tensorflow]
[unofficial code - tensorflow]
[unofficial code - pytorch]
[2017]
[DSOD] DSOD: Learning Deeply Supervised Object Detectors from Scratch | Zhiqiang Shen, et al. | [ICCV' 17] |[pdf]
[official code - caffe]
[unofficial code - pytorch]
[2017]
[SMN] Spatial Memory for Context Reasoning in Object Detection | Xinlei Chen, Abhinav Gupta | [ICCV' 17] |[pdf]
[2017]
[YOLO v3] YOLOv3: An Incremental Improvement | Joseph Redmon, Ali Farhadi | [arXiv' 18] |[pdf]
[official code - c]
[unofficial code - pytorch]
[unofficial code - pytorch]
[unofficial code - keras]
[unofficial code - tensorflow]
[2018]
[ZIP] Zoom Out-and-In Network with Recursive Training for Object Proposal | Hongyang Li, et al. | [IJCV' 18] |[pdf]
[official code - caffe]
[2018]
[SIN] Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships | Yong Liu, et al. | [CVPR' 18] |[pdf]
[official code - tensorflow]
[2018]
[STDN] Scale-Transferrable Object Detection | Peng Zhou, et al. | [CVPR' 18] |[pdf]
[RefineDet] Single-Shot Refinement Neural Network for Object Detection | Shifeng Zhang, et al. | [CVPR' 18] |[pdf]
[official code - caffe]
[unofficial code - chainer]
[unofficial code - pytorch]
[2018]
[MegDet] MegDet: A Large Mini-Batch Object Detector | Chao Peng, et al. | [CVPR' 18] |[pdf]
[2018]
[DA Faster R-CNN] Domain Adaptive Faster R-CNN for Object Detection in the Wild | Yuhua Chen, et al. | [CVPR' 18] |[pdf]
[official code - caffe]
[2018]
[SNIP] An Analysis of Scale Invariance in Object Detection – SNIP | Bharat Singh, Larry S. Davis | [CVPR' 18] |[pdf]
[2018]
[Relation-Network] Relation Networks for Object Detection | Han Hu, et al. | [CVPR' 18] |[pdf]
[official code - mxnet]
[2018]
[Cascade R-CNN] Cascade R-CNN: Delving into High Quality Object Detection | Zhaowei Cai, et al. | [CVPR' 18] |[pdf]
[official code - caffe]
[2018]
Finding Tiny Faces in the Wild with Generative Adversarial Network | Yancheng Bai, et al. | [CVPR' 18] |[pdf]
[2018]
[STDnet] STDnet: A ConvNet for Small Target Detection | Brais Bosquet, et al. | [BMVC' 18] |[pdf]
[2018]
[RFBNet] Receptive Field Block Net for Accurate and Fast Object Detection | Songtao Liu, et al. | [ECCV' 18] |[pdf]
[official code - pytorch]
[2018]
Zero-Annotation Object Detection with Web Knowledge Transfer | Qingyi Tao, et al. | [ECCV' 18] |[pdf]
[2018]
[CornerNet] CornerNet: Detecting Objects as Paired Keypoints | Hei Law, et al. | [ECCV' 18] |[pdf]
[official code - pytorch]
[2018]
[Pelee] Pelee: A Real-Time Object Detection System on Mobile Devices | Jun Wang, et al. | [NIPS' 18] |[pdf]
[official code - caffe]
[2018]
[HKRM] Hybrid Knowledge Routed Modules for Large-scale Object Detection | ChenHan Jiang, et al. | [NIPS' 18] |[pdf]
[2018]
[MetaAnchor] MetaAnchor: Learning to Detect Objects with Customized Anchors | Tong Yang, et al. | [NIPS' 18] |[pdf]
[2018]
[M2Det] M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network | Qijie Zhao, et al. | [AAAI' 19] |[pdf]
[2018]
[Libra RetinaNet] Libra R-CNN: Towards Balanced Learning for Object Detection | Jiangmiao Pang, et al. | [pdf]
[2019]
[YOLACT-700] YOLACT Real-time Instance Segmentation | [pdf]
[2019]
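| Detector | VOC07 (mAP@IoU=0.5) | VOC12 (mAP@IoU=0.5) | COCO (mAP) |
|---|---|---|---|
| R-CNN | 58.5 | - | - |
| OverFeat | - | - | - |
| MultiBox | 29.0 | - | - |
| SPP-Net | 59.2 | - | - |
| MR-CNN | 78.2 | 73.9 | - |
| AttentionNet | - | - | - |
| Fast R-CNN | 70.0 | 68.4 | - |
| Faster R-CNN | 73.2 | 70.4 | 36.8 |
| YOLO v1 | 66.4 | 57.9 | - |
| G-CNN | 66.8 | 66.4 | - |
| AZNet | 70.4 | - | 22.3 |
| ION | 80.1 | 77.9 | 33.1 |
| HyperNet | 76.3 | 71.4 | - |
| OHEM | 78.9 | 76.3 | 22.4 |
| MPN | - | - | 33.2 |
| SSD | 76.8 | 74.9 | 31.2 |
| GBDNet | 77.2 | - | 27.0 |
| CPF | 76.4 | 72.6 | - |
| MS-CNN | - | - | - |
| R-FCN | 79.5 | 77.6 | 29.9 |
| PVANET | - | - | - |
| DeepID-Net | 69.0 | - | - |
| NoC | 71.6 | 68.8 | 27.2 |
| DSSD | 81.5 | 80.0 | - |
| TDM | - | - | 37.3 |
| FPN | - | - | 36.2 |
| YOLO v2 | 78.6 | 73.4 | 21.6 |
| RON | 77.6 | 75.4 | - |
| DCN | - | - | - |
| DeNet | 77.1 | 73.9 | 33.8 |
| CoupleNet | 82.7 | 80.4 | 34.4 |
| RetinaNet | - | - | 39.1 |
| Mask R-CNN | - | - | 39.8 |
| DSOD | 77.7 | 76.3 | - |
| SMN | 70.0 | - | - |
| YOLO v3 | - | - | 33.0 |
| SIN | 76.0 | 73.1 | 23.2 |
| STDN | 80.9 | - | - |
| RefineDet | 83.8 | 83.5 | 41.8 |
| MegDet | - | - | - |
| RFBNet | 82.2 | - | - |
| CornerNet | - | - | 42.1 |
| Libra RetinaNet | - | - | 43.0 |
| YOLACT-700 | - | - | 31.2 |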
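The "mAP@IoU=0.5" columns rely on a box-overlap test: a detection counts as correct when its intersection over union with a ground-truth box reaches 0.5. The sketch below covers only that overlap test, not the full precision-recall averaging; `box_iou` and `is_true_positive` are hypothetical helper names, and whether the threshold comparison is strict varies between evaluation tools. The COCO column presumably follows COCO's primary metric, which averages AP over IoU thresholds from 0.5 to 0.95.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(det, gt, threshold=0.5):
    """Assumed matching rule: IoU at or above the threshold counts as a hit."""
    return box_iou(det, gt) >= threshold


# Hypothetical boxes: the first detection overlaps the ground truth by 2/3,
# the second by only 1/3.
gt = (0, 0, 10, 10)
print(is_true_positive((2, 0, 12, 10), gt))  # True
print(is_true_positive((5, 0, 15, 10), gt))  # False
```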