awesome-computer-vision-models

This is a list of popular classification, segmentation, and detection models with their corresponding evaluation metrics.

You can try some of the models in the TensorFlow.js demo application.

Classification models

AlexNet ('One weird trick for parallelizing convolutional neural networks') [2014]
VGG/BN-VGG ('Very Deep Convolutional Networks for Large-Scale Image Recognition') [2014]
ResNet ('Deep Residual Learning for Image Recognition') [2015]
InceptionV3 ('Rethinking the Inception Architecture for Computer Vision') [2015]
PreResNet ('Identity Mappings in Deep Residual Networks') [2016]
DenseNet ('Densely Connected Convolutional Networks') [2016]
PyramidNet ('Deep Pyramidal Residual Networks') [2016]
ResNeXt ('Aggregated Residual Transformations for Deep Neural Networks') [2016]
WRN ('Wide Residual Networks') [2016]
Xception ('Xception: Deep Learning with Depthwise Separable Convolutions') [2016]
InceptionV4/InceptionResNetV2 ('Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning') [2016]
PolyNet ('PolyNet: A Pursuit of Structural Diversity in Very Deep Networks') [2016]
DarkNet ('Darknet: Open source neural networks in C') [2016?]
ResAttNet ('Residual Attention Network for Image Classification') [2017]
CondenseNet ('CondenseNet: An Efficient DenseNet using Learned Group Convolutions') [2017]
DRN-C/DRN-D ('Dilated Residual Networks') [2017]
DPN ('Dual Path Networks') [2017]
ShuffleNet ('ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices') [2017]
DiracNetV2 ('DiracNets: Training Very Deep Neural Networks Without Skip-Connections') [2017]
SENet/SE-ResNet/SE-PreResNet/SE-ResNeXt ('Squeeze-and-Excitation Networks') [2017]
MobileNet ('MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications') [2017]
NASNet ('Learning Transferable Architectures for Scalable Image Recognition') [2017]
DLA ('Deep Layer Aggregation') [2017]
AirNet/AirNeXt ('Attention Inspiring Receptive-Fields Network for Learning Invariant Representations') [2018]
BAM-ResNet ('BAM: Bottleneck Attention Module') [2018]
CBAM-ResNet ('CBAM: Convolutional Block Attention Module') [2018]
SqueezeNet/SqueezeResNet ('SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size') [2016]
SqueezeNext ('SqueezeNext: Hardware-Aware Neural Network Design') [2018]
ShuffleNetV2 ('ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design') [2018]
MENet ('Merging and Evolution: Improving Convolutional Neural Networks for Mobile Applications') [2018]
FD-MobileNet ('FD-MobileNet: Improved MobileNet with A Fast Downsampling Strategy') [2018]
MobileNetV2 ('MobileNetV2: Inverted Residuals and Linear Bottlenecks') [2018]
IGCV3 ('IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks') [2018]
DARTS ('DARTS: Differentiable Architecture Search') [2018]
PNASNet ('Progressive Neural Architecture Search') [2018]
Amoeba ('Regularized Evolution for Image Classifier Architecture Search') [2018]
MnasNet ('MnasNet: Platform-Aware Neural Architecture Search for Mobile') [2018]
IBN-Net ('Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net') [2018]
MarginNet ('Large Margin Deep Networks for Classification') [2018]
A^2 Nets ('A^2-Nets: Double Attention Networks') [2018]
FishNet ('FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction') [2018]
Shape-ResNet ('IMAGENET-TRAINED CNNS ARE BIASED TOWARDS TEXTURE; INCREASING SHAPE BIAS IMPROVES ACCURACY AND ROBUSTNESS') [2019]
Shift-Invariant-ResNet ('Making Convolutional Networks Shift-Invariant Again') [2019]
SimCNN ('Greedy Layerwise Learning Can Scale to ImageNet') [2019]
SKNet ('Selective Kernel Networks') [2019]
SRM ('SRM : A Style-based Recalibration Module for Convolutional Neural Networks') [2019]
EfficientNet ('EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks') [2019]
ProxylessNAS ('PROXYLESSNAS: DIRECT NEURAL ARCHITECTURE SEARCH ON TARGET TASK AND HARDWARE') [2019]
MixNet ('MixNet: Mixed Depthwise Convolutional Kernels') [2019]
ECA-Net ('ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks') [2019]
ACNet-Densenet121 ('ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks') [2019]
LIP-* ('LIP: Local Importance-based Pooling') [2019]
MuffNet ('MuffNet: Multi-Layer Feature Federation for Mobile Deep Learning') [2019]
*-Bin-5 ('Making Convolutional Networks Shift-Invariant Again') [2019]
FixRes ResNeXt-101 WSL ('Fixing the train-test resolution discrepancy') [2019]
Noisy Student (L2) (using extra-data) ('Self-training with Noisy Student improves ImageNet classification') [2019]

Model	Number of parameters	FLOPs	Top-1 Error (%)	Top-5 Error (%)	Demo
AlexNet	62.3M	1,132.33M	40.96	18.24	X
VGG-16	138.3M	?	26.78	8.69	X
ResNet-10	5.5M	894.04M	34.69	14.36	Try live
ResNet-18	11.7M	1,820.41M	28.53	9.82	Try live
ResNet-34	21.8M	3,672.68M	24.84	7.80	Try live
ResNet-50	25.5M	3,877.95M	22.28	6.33	Try live
Inception v3	23.8M	?	21.2	5.6	X
PreResNet-18	11.7M	1,820.56M	28.43	9.72	Try live
PreResNet-34	21.8M	3,672.83M	24.89	7.74	Try live
PreResNet-50	25.6M	3,875.44M	22.40	6.47	Try live
DenseNet-121	8.0M	2,872.13M	23.48	7.04	Try live
DenseNet-161	28.7M	7,793.16M	22.86	6.44	X
PyramidNet-101	42.5M	8,743.54M	21.98	6.20	X
ResNeXt-14(32x4d)	9.5M	1,603.46M	30.32	11.46	Try live
ResNeXt-26(32x4d)	15.4M	2,488.07M	24.14	7.46	Try live
WRN-50-2	68.9M	11,405.42M	22.53	6.41	X
Xception	22,855,952	8,403.63M	20.97	5.49	X
InceptionV4	42,679,816	12,304.93M	20.64	5.29	X
InceptionResNetV2	55,843,464	13,188.64M	19.93	4.90	X
PolyNet	95,366,600	34,821.34M	19.10	4.52	X
DarkNet Ref	7,319,416	367.59M	38.58	17.18	Try live
DarkNet Tiny	1,042,104	500.85M	40.74	17.84	Try live
DarkNet 53	41,609,928	7,133.86M	21.75	5.64	Try live
Attention-92	51.3M	?	19.5	4.8	X
CondenseNet (G=C=8)	4.8M	?	26.2	8.3	X
DPN-68	12,611,602	2,351.84M	23.24	6.79	Try live
ShuffleNet x1.0 (g=1)	1,531,936	148.13M	34.93	13.89	Try live
DiracNetV2-18	11,511,784	1,796.62M	31.47	11.70	Try live
DiracNetV2-34	21,616,232	3,646.93M	28.75	9.93	Try live
SENet-16	31,366,168	5,081.30M	25.65	8.20	Try live
SENet-154	115,088,984	20,745.78M	18.62	4.61	X
MobileNet x1.0	4,231,976	579.80M	26.61	8.95	Try live
NASNet-A 4@1056	5,289,978	584.90M	25.68	8.16	Try live
NASNet-A 6@4032	88,753,150	23,976.44M	18.14	4.21	X
DLA-34	15,742,104	3,071.37M	25.36	7.94	Try live
AirNet50-1x64d (r=2)	27.43M	?	22.48	6.21	X
BAM-ResNet-50	25.92M	?	23.68	6.96	X
CBAM-ResNet-50	28.1M	?	23.02	6.38	X
SqueezeResNet1.1	1,235,496	352.02M	40.09	18.21	Try live
SqueezeNet1.1	1,235,496	352.02M	39.31	17.72	Try live
1.0-SqNxt-23v5	921,816	285.82M	40.77	17.85	X
1.5-SqNxt-23v5	1,953,616	550.97M	33.81	13.01	X
2.0-SqNxt-23v5	3,366,344	897.60M	29.63	10.66	X
ShuffleNetV2 x1.0	2,278,604	149.72M	31.44	11.63	Try live
456-MENet-24×1(g=3)	5.3M	?	28.4	9.8	X
FD-MobileNet x1.0	2,901,288	147.46M	34.23	13.38	Try live
MobileNetV2 x1.0	3,504,960	329.36M	26.97	8.87	Try live
IGCV3	3.5M	?	28.22	9.54	X
DARTS	4.9M	?	26.9	9.0	X
PNASNet-5	5.1M	?	25.8	8.1	X
AmoebaNet-C	5.1M	?	24.3	7.6	X
MnasNet	4,308,816	317.67M	31.58	11.74	Try live
IBN-Net50-a	?	?	22.54	6.32	X
MarginNet	?	?	22.0	?	X
A^2 Net	?	?	23.0	6.5	X
FishNeXt-150	26.2M	?	21.5	?	X
Shape-ResNet	25.5M	?	23.28	6.72	X
ResNet-50-Bin-5	?	?	23.0	?	X
SimCNN(k=3 train)	?	?	28.4	10.2	X
SKNet-50	27.5M	?	20.79	?	X
SRM-ResNet-50	25.62M	?	22.87	6.49	X
EfficientNet-B0	5,288,548	414.31M	24.77	7.52	Try live
EfficientNet-B7b	66,347,960	39,010.98M	15.94	3.22	X
ProxylessNAS	?	?	24.9	7.5	X
MixNet-L	7.3M	?	21.1	5.8	X
ECA-Net50	24.37M	3.86G	22.52	6.32	X
ECA-Net101	42.49M	7.35G	21.35	5.66	X
ACNet-Densenet121	?	?	24.18	7.23	X
LIP-ResNet-50	23.9M	5.33G	21.81	6.04	X
LIP-ResNet-101	42.9M	9.06G	20.67	5.40	X
LIP-DenseNet-BC-121	8.7M	4.13G	23.36	6.84	X
MuffNet_1.0	2.3M	146M	30.1	?	X
MuffNet_1.5	3.4M	300M	26.9	?	X
ResNet-34-Bin-5	21.8M	3,672.68M	25.80	?	X
ResNet-50-Bin-5	25.5M	3,877.95M	22.96	?	X
MobileNetV2-Bin-5	3,504,960	329.36M	27.50	?	X
FixRes ResNeXt101 WSL	829M	?	13.6	2.0	X
Noisy Student*(L2)	480M	?	12.6	1.8	X
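
The Top-1 and Top-5 error columns report the percentage of validation images whose true label is not among the model's one or five highest-scoring classes. A minimal sketch of that computation in plain Python follows. The function name `top_k_error` and the toy scores are illustrative assumptions, not code from any of the listed repositories.

```python
def top_k_error(scores, labels, k=1):
    """Return the top-k error in percent.

    scores: list of per-image class-score lists (higher means more likely).
    labels: list of true class indices, one per image.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the k highest-scoring classes for this image.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return 100.0 * misses / len(labels)


# Toy data (hypothetical): 3 images, 4 classes.
scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.5, 0.1, 0.3, 0.1],
    [0.2, 0.2, 0.5, 0.1],
]
labels = [1, 2, 3]
top1 = top_k_error(scores, labels, k=1)
top2 = top_k_error(scores, labels, k=2)
print(f"top-1 error: {top1:.2f}%, top-2 error: {top2:.2f}%")
```

With 1000 ImageNet classes, the same function with `k=5` would give the Top-5 error, which can never exceed the Top-1 error for the same model.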

Segmentation models

Semantic segmentation

U-Net ('U-Net: Convolutional Networks for Biomedical Image Segmentation') [2015]
DeconvNet ('Learning Deconvolution Network for Semantic Segmentation') [2015]
ParseNet ('ParseNet: Looking Wider to See Better') [2015]
Piecewise ('Efficient piecewise training of deep structured models for semantic segmentation') [2015]
SegNet ('SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation') [2016]
FCN ('Fully Convolutional Networks for Semantic Segmentation') [2016]
ENet ('ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation') [2016]
DilatedNet ('MULTI-SCALE CONTEXT AGGREGATION BY DILATED CONVOLUTIONS') [2016]
PixelNet ('PixelNet: Towards a General Pixel-Level Architecture') [2016]
RefineNet ('RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation') [2016]
LRR ('Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation') [2016]
FRRN ('Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes') [2016]
Semantic Segmentation using Adversarial Networks ('Semantic Segmentation using Adversarial Networks') [2016]
MultiNet ('MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving') [2016]
DeepLab ('DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs') [2017]
LinkNet ('LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation') [2017]
Tiramisu ('The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation') [2017]
ICNet ('ICNet for Real-Time Semantic Segmentation on High-Resolution Images') [2017]
ERFNet ('Efficient ConvNet for Real-time Semantic Segmentation') [2017]
PSPNet ('Pyramid Scene Parsing Network') [2017]
GCN ('Large Kernel Matters — Improve Semantic Segmentation by Global Convolutional Network') [2017]
Segaware ('Segmentation-Aware Convolutional Networks Using Local Attention Masks') [2017]
PixelDCN ('PIXEL DECONVOLUTIONAL NETWORKS') [2017]
DeepLabv3 ('Rethinking Atrous Convolution for Semantic Image Segmentation') [2017]
DUC, HDC ('Understanding Convolution for Semantic Segmentation') [2018]
ShuffleSeg ('SHUFFLESEG: REAL-TIME SEMANTIC SEGMENTATION NETWORK') [2018]
AdaptSegNet ('Learning to Adapt Structured Output Space for Semantic Segmentation') [2018]
TuSimple-DUC ('Understanding Convolution for Semantic Segmentation') [2018]
R2U-Net ('Recurrent Residual Convolutional Neural Network based on U-Net (R2U-Net) for Medical Image Segmentation') [2018]
Attention U-Net ('Attention U-Net: Learning Where to Look for the Pancreas') [2018]
DANet ('Dual Attention Network for Scene Segmentation') [2018]
ENCNet ('Context Encoding for Semantic Segmentation') [2018]
ShelfNet ('ShelfNet for Real-time Semantic Segmentation') [2018]
LadderNet ('LADDERNET: MULTI-PATH NETWORKS BASED ON U-NET FOR MEDICAL IMAGE SEGMENTATION') [2018]
CCC ('Concentrated-Comprehensive Convolutions for lightweight semantic segmentation') [2018]
DifNet ('DifNet: Semantic Segmentation by Diffusion Networks') [2018]
BiSeNet ('BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation') [2018]
ESPNet ('ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation') [2018]
SPADE ('Semantic Image Synthesis with Spatially-Adaptive Normalization') [2019]
SeamlessSeg ('Seamless Scene Segmentation') [2019]
EMANet ('Expectation-Maximization Attention Networks for Semantic Segmentation') [2019]

Model	PASCAL-Context	Cityscapes (mIOU)	PASCAL VOC 2012 (mIOU)	COCO Stuff	ADE20K VAL (mIOU)
U-Net	?	?	?	?	?
DeconvNet	?	?	72.5	?	?
ParseNet	40.4	?	69.8	?	?
Piecewise	43.3	71.6	78.0	?	?
SegNet	?	56.1	?	?	?
FCN	37.8	65.3	62.2	22.7	29.39
ENet	?	58.3	?	?	?
DilatedNet	?	?	67.6	?	32.31
PixelNet	?	?	69.8	?	?
RefineNet	47.3	73.6	83.4	33.6	40.70
LRR	?	71.8	79.3	?	?
FRRN	?	71.8	?	?	?
MultiNet	?	?	?	?	?
DeepLab	45.7	64.8	79.7	?	?
LinkNet	?	?	?	?	?
Tiramisu	?	?	?	?	?
ICNet	?	70.6	?	?	?
ERFNet	?	68.0	?	?	?
PSPNet	47.8	80.2	85.4	?	44.94
GCN	?	76.9	82.2	?	?
Segaware	?	?	69.0	?	?
PixelDCN	?	?	73.0	?	?
DeepLabv3	?	?	85.7	?	?
DUC, HDC	?	77.1	?	?	?
ShuffleSeg	?	59.3	?	?	?
AdaptSegNet	?	46.7	?	?	?
TuSimple-DUC	80.1	?	83.1	?	?
R2U-Net	?	?	?	?	?
Attention U-Net	?	?	?	?	?
DANet	52.6	81.5	?	39.7	?
ENCNet	51.7	75.8	85.9	?	44.65
ShelfNet	48.4	75.8	84.2	?	?
LadderNet	?	?	?	?	?
CCC-ERFnet	?	69.01	?	?	?
DifNet-101	45.1	?	73.2	?	?
BiSeNet(Res18)	?	?	74.7	28.1	?
ESPNet	?	?	63.01	?	?
SPADE	?	62.3	?	37.4	38.5
SeamlessSeg	?	77.5	?	?	?
EMANet	?	?	88.2	39.9	?
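
The mIOU columns above are mean intersection-over-union scores: for each class, the number of pixels where prediction and ground truth agree divided by the number of pixels where either assigns that class, averaged over classes. The sketch below computes this for flat lists of per-pixel class indices. The function name `mean_iou` is an illustrative assumption. Real benchmark toolkits typically accumulate counts over the whole dataset and skip "ignore" labels, which this sketch leaves out.

```python
def mean_iou(pred, target, num_classes):
    """Return the mean IoU in percent over classes that occur in pred or target.

    pred, target: flat lists of class indices, one entry per pixel.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # The pixel counts once toward both intersection and union.
            intersection[p] += 1
            union[p] += 1
        else:
            # The pixel belongs to the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no class occurs in pred or target")
    return 100.0 * sum(ious) / len(ious)


# Toy data (hypothetical): 6 pixels, 3 classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
miou = mean_iou(pred, target, num_classes=3)
print(f"mIoU: {miou:.1f}%")
```

In the toy data the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 50%.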

Detection models

[R-CNN] Rich feature hierarchies for accurate object detection and semantic segmentation | Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik | [CVPR' 14] |[pdf] [official code - caffe] [2014]
[OverFeat] OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks | Pierre Sermanet, et al. | [ICLR' 14] |[pdf] [official code - torch] [2014]
[MultiBox] Scalable Object Detection using Deep Neural Networks | Dumitru Erhan, et al. | [CVPR' 14] |[pdf] [2014]
[SPP-Net] Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition | Kaiming He, et al. | [ECCV' 14] |[pdf] [official code - caffe] [unofficial code - keras] [unofficial code - tensorflow] [2014]
[MR-CNN] Object detection via a multi-region & semantic segmentation-aware CNN model | Spyros Gidaris, Nikos Komodakis | [ICCV' 15] |[pdf] [official code - caffe] [2015]
[DeepBox] DeepBox: Learning Objectness with Convolutional Networks | Weicheng Kuo, Bharath Hariharan, Jitendra Malik | [ICCV' 15] |[pdf] [official code - caffe] [2015]
[AttentionNet] AttentionNet: Aggregating Weak Directions for Accurate Object Detection | Donggeun Yoo, et al. | [ICCV' 15] |[pdf] [2015]
[Fast R-CNN] Fast R-CNN | Ross Girshick | [ICCV' 15] |[pdf] [official code - caffe] [2015]
[DeepProposal] DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers | Amir Ghodrati, et al. | [ICCV' 15] |[pdf] [official code - matconvnet] [2015]
[Faster R-CNN, RPN] Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | Shaoqing Ren, et al. | [NIPS' 15] |[pdf] [official code - caffe] [unofficial code - tensorflow] [unofficial code - pytorch] [2015]
[YOLO v1] You Only Look Once: Unified, Real-Time Object Detection | Joseph Redmon, et al. | [CVPR' 16] |[pdf] [official code - c] [2016]
[G-CNN] G-CNN: an Iterative Grid Based Object Detector | Mahyar Najibi, et al. | [CVPR' 16] |[pdf] [2016]
[AZNet] Adaptive Object Detection Using Adjacency and Zoom Prediction | Yongxi Lu, Tara Javidi. | [CVPR' 16] |[pdf] [2016]
[ION] Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks | Sean Bell, et al. | [CVPR' 16] |[pdf] [2016]
[HyperNet] HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection | Tao Kong, et al. | [CVPR' 16] |[pdf] [2016]
[OHEM] Training Region-based Object Detectors with Online Hard Example Mining | Abhinav Shrivastava, et al. | [CVPR' 16] |[pdf] [official code - caffe] [2016]
[CRAFT] CRAFT Objects from Images | Bin Yang, et al. | [CVPR' 16] |[pdf] [official code - caffe] [2016]
[MPN] A MultiPath Network for Object Detection | Sergey Zagoruyko, et al. | [BMVC' 16] |[pdf] [official code - torch] [2016]
[SSD] SSD: Single Shot MultiBox Detector | Wei Liu, et al. | [ECCV' 16] |[pdf] [official code - caffe] [unofficial code - tensorflow] [unofficial code - pytorch] [2016]
[GBDNet] Crafting GBD-Net for Object Detection | Xingyu Zeng, et al. | [ECCV' 16] |[pdf] [official code - caffe] [2016]
[CPF] Contextual Priming and Feedback for Faster R-CNN | Abhinav Shrivastava and Abhinav Gupta | [ECCV' 16] |[pdf] [2016]
[MS-CNN] A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection | Zhaowei Cai, et al. | [ECCV' 16] |[pdf] [official code - caffe] [2016]
[R-FCN] R-FCN: Object Detection via Region-based Fully Convolutional Networks | Jifeng Dai, et al. | [NIPS' 16] |[pdf] [official code - caffe] [unofficial code - caffe] [2016]
[PVANET] PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection | Kye-Hyeon Kim, et al. | [NIPSW' 16] |[pdf] [official code - caffe] [2016]
[DeepID-Net] DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection | Wanli Ouyang, et al. | [PAMI' 16] |[pdf] [2016]
[NoC] Object Detection Networks on Convolutional Feature Maps | Shaoqing Ren, et al. | [TPAMI' 16] |[pdf] [2016]
[DSSD] DSSD : Deconvolutional Single Shot Detector | Cheng-Yang Fu, et al. | [arXiv' 17] |[pdf] [official code - caffe] [2017]
[TDM] Beyond Skip Connections: Top-Down Modulation for Object Detection | Abhinav Shrivastava, et al. | [CVPR' 17] |[pdf] [2017]
[FPN] Feature Pyramid Networks for Object Detection | Tsung-Yi Lin, et al. | [CVPR' 17] |[pdf] [unofficial code - caffe] [2017]
[YOLO v2] YOLO9000: Better, Faster, Stronger | Joseph Redmon, Ali Farhadi | [CVPR' 17] |[pdf] [official code - c] [unofficial code - caffe] [unofficial code - tensorflow] [unofficial code - tensorflow] [unofficial code - pytorch] [2017]
[RON] RON: Reverse Connection with Objectness Prior Networks for Object Detection | Tao Kong, et al. | [CVPR' 17] |[pdf] [official code - caffe] [unofficial code - tensorflow] [2017]
[DCN] Deformable Convolutional Networks | Jifeng Dai, et al. | [ICCV' 17] |[pdf] [official code - mxnet] [unofficial code - tensorflow] [unofficial code - pytorch] [2017]
[DeNet] DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling | Lachlan Tychsen-Smith, Lars Petersson | [ICCV' 17] |[pdf] [official code - theano] [2017]
[CoupleNet] CoupleNet: Coupling Global Structure with Local Parts for Object Detection | Yousong Zhu, et al. | [ICCV' 17] |[pdf] [official code - caffe] [2017]
[RetinaNet] Focal Loss for Dense Object Detection | Tsung-Yi Lin, et al. | [ICCV' 17] |[pdf] [official code - keras] [unofficial code - pytorch] [unofficial code - mxnet] [unofficial code - tensorflow] [2017]
[Mask R-CNN] Mask R-CNN | Kaiming He, et al. | [ICCV' 17] |[pdf] [official code - caffe2] [unofficial code - tensorflow] [unofficial code - tensorflow] [unofficial code - pytorch] [2017]
[DSOD] DSOD: Learning Deeply Supervised Object Detectors from Scratch | Zhiqiang Shen, et al. | [ICCV' 17] |[pdf] [official code - caffe] [unofficial code - pytorch] [2017]
[SMN] Spatial Memory for Context Reasoning in Object Detection | Xinlei Chen, Abhinav Gupta | [ICCV' 17] |[pdf] [2017]
[YOLO v3] YOLOv3: An Incremental Improvement | Joseph Redmon, Ali Farhadi | [arXiv' 18] |[pdf] [official code - c] [unofficial code - pytorch] [unofficial code - pytorch] [unofficial code - keras] [unofficial code - tensorflow] [2018]
[ZIP] Zoom Out-and-In Network with Recursive Training for Object Proposal | Hongyang Li, et al. | [IJCV' 18] |[pdf] [official code - caffe] [2018]
[SIN] Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships | Yong Liu, et al. | [CVPR' 18] |[pdf] [official code - tensorflow] [2018]
[STDN] Scale-Transferrable Object Detection | Peng Zhou, et al. | [CVPR' 18] |[pdf] [2018]
[RefineDet] Single-Shot Refinement Neural Network for Object Detection | Shifeng Zhang, et al. | [CVPR' 18] |[pdf] [official code - caffe] [unofficial code - chainer] [unofficial code - pytorch] [2018]
[MegDet] MegDet: A Large Mini-Batch Object Detector | Chao Peng, et al. | [CVPR' 18] |[pdf] [2018]
[DA Faster R-CNN] Domain Adaptive Faster R-CNN for Object Detection in the Wild | Yuhua Chen, et al. | [CVPR' 18] |[pdf] [official code - caffe] [2018]
[SNIP] An Analysis of Scale Invariance in Object Detection – SNIP | Bharat Singh, Larry S. Davis | [CVPR' 18] |[pdf] [2018]
[Relation-Network] Relation Networks for Object Detection | Han Hu, et al. | [CVPR' 18] |[pdf] [official code - mxnet] [2018]
[Cascade R-CNN] Cascade R-CNN: Delving into High Quality Object Detection | Zhaowei Cai, et al. | [CVPR' 18] |[pdf] [official code - caffe] [2018]
Finding Tiny Faces in the Wild with Generative Adversarial Network | Yancheng Bai, et al. | [CVPR' 18] |[pdf] [2018]
[STDnet] STDnet: A ConvNet for Small Target Detection | Brais Bosquet, et al. | [BMVC' 18] |[pdf] [2018]
[RFBNet] Receptive Field Block Net for Accurate and Fast Object Detection | Songtao Liu, et al. | [ECCV' 18] |[pdf] [official code - pytorch] [2018]
Zero-Annotation Object Detection with Web Knowledge Transfer | Qingyi Tao, et al. | [ECCV' 18] |[pdf] [2018]
[CornerNet] CornerNet: Detecting Objects as Paired Keypoints | Hei Law, et al. | [ECCV' 18] |[pdf] [official code - pytorch] [2018]
[Pelee] Pelee: A Real-Time Object Detection System on Mobile Devices | Jun Wang, et al. | [NIPS' 18] |[pdf] [official code - caffe] [2018]
[HKRM] Hybrid Knowledge Routed Modules for Large-scale Object Detection | ChenHan Jiang, et al. | [NIPS' 18] |[pdf] [2018]
[MetaAnchor] MetaAnchor: Learning to Detect Objects with Customized Anchors | Tong Yang, et al. | [NIPS' 18] |[pdf] [2018]
[M2Det] M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network | Qijie Zhao, et al. | [AAAI' 19] |[pdf] [2018]
[Libra RetinaNet] Libra R-CNN: Towards Balanced Learning for Object Detection | Jiangmiao Pang, et al. | [pdf] [2019]
[YOLACT-700] YOLACT Real-time Instance Segmentation | [pdf] [2019]
[DetNASNet] DetNAS: Backbone Search for Object Detection | [pdf] [2019]

Detector	VOC07 (mAP@IoU=0.5)	VOC12 (mAP@IoU=0.5)	COCO (mAP)
R-CNN	58.5	-	-
OverFeat	-	-	-
MultiBox	29.0	-	-
SPP-Net	59.2	-	-
MR-CNN	78.2	73.9	-
AttentionNet	-	-	-
Fast R-CNN	70.0	68.4	-
Faster R-CNN	73.2	70.4	36.8
YOLO v1	66.4	57.9	-
G-CNN	66.8	66.4	-
AZNet	70.4	-	22.3
ION	80.1	77.9	33.1
HyperNet	76.3	71.4	-
OHEM	78.9	76.3	22.4
MPN	-	-	33.2
SSD	76.8	74.9	31.2
GBDNet	77.2	-	27.0
CPF	76.4	72.6	-
MS-CNN	-	-	-
R-FCN	79.5	77.6	29.9
PVANET	-	-	-
DeepID-Net	69.0	-	-
NoC	71.6	68.8	27.2
DSSD	81.5	80.0	-
TDM	-	-	37.3
FPN	-	-	36.2
YOLO v2	78.6	73.4	21.6
RON	77.6	75.4	-
DCN	-	-	-
DeNet	77.1	73.9	33.8
CoupleNet	82.7	80.4	34.4
RetinaNet	-	-	39.1
Mask R-CNN	-	-	39.8
DSOD	77.7	76.3	-
SMN	70.0	-	-
YOLO v3	-	-	33.0
SIN	76.0	73.1	23.2
STDN	80.9	-	-
RefineDet	83.8	83.5	41.8
MegDet	-	-	-
RFBNet	82.2	-	-
CornerNet	-	-	42.1
LibraRetinaNet	-	-	43.0
YOLACT-700	-	-	31.2
DetNASNet(3.8)	-	-	42.0
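
The VOC columns use mAP at an IoU threshold of 0.5: a predicted box counts as a correct detection only if its intersection-over-union with a ground-truth box of the same class reaches that threshold. The COCO column is conventionally averaged over several IoU thresholds instead. Below is a minimal sketch of the box IoU itself, assuming boxes given as `(x_min, y_min, x_max, y_max)` tuples. The names `box_iou` and `is_match` are illustrative, and exact boundary handling varies between evaluation toolkits.

```python
def box_iou(a, b):
    """Return the IoU of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlapping rectangle, clamped at zero.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """Assumed matching rule: IoU at or above the threshold counts as a hit."""
    return box_iou(pred_box, gt_box) >= threshold


# Toy boxes (hypothetical coordinates).
gt = (0, 0, 4, 4)
close = (0, 1, 4, 4)   # overlaps 12 of 16 units of area
far = (3, 3, 7, 7)     # overlaps a single unit of area
print(box_iou(gt, close), is_match(gt, close))
print(box_iou(gt, far), is_match(gt, far))
```

A full mAP computation would additionally rank detections by confidence, build a precision-recall curve per class, and average the resulting per-class AP values.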
