lilujunai/rexnet

(NOTICE) Our paper has been accepted at CVPR 2021!! The submitted paper will be updated on arXiv!

(NOTICE) New models ReXNet-Lites which outperform EfficientNet-Lites will be uploaded soon!

ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network

Dongyoon Han, Sangdoo Yun, Byeongho Heo, and YoungJoon Yoo | Paper | Pretrained Models

AI LAB, NAVER Corp.

Abstract

This paper addresses the representational bottleneck in a network and proposes a set of design principles that improve model performance significantly. We argue that a representational bottleneck may occur in a network designed in a conventional manner and that it degrades model performance. To investigate the representational bottleneck, we study the matrix rank of the features generated by ten thousand random networks. We further study the channel configuration of the entire set of layers towards designing more accurate network architectures. Based on this investigation, we propose simple yet effective design principles to mitigate the representational bottleneck. Slight changes to baseline networks that follow these principles lead to remarkable performance improvements on ImageNet classification. Additionally, COCO object detection results and transfer learning results on several datasets provide further support for the link between diminishing the representational bottleneck of a network and improving its performance.
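As a toy illustration of the rank argument (not the paper's experimental protocol), the sketch below checks that a linear layer mapping `d_in` channels to `d_out` channels produces features whose rank is at most `d_in`, however large `d_out` is. All names and sizes here (`matmul`, `matrix_rank`, `random_matrix`, `d_in`, `d_out`, `n_samples`) are hypothetical, nonlinearities are ignored, and exact rational arithmetic stands in for a floating-point rank estimate.

```python
import random
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def matrix_rank(matrix):
    """Exact rank via Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    rank = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Eliminate this column from every row below the pivot row.
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def random_matrix(n_rows, n_cols, rng):
    """Matrix of small random integers, as a list of rows."""
    return [[rng.randint(-5, 5) for _ in range(n_cols)] for _ in range(n_rows)]


rng = random.Random(0)
d_in, d_out, n_samples = 4, 16, 32            # hypothetical toy sizes
weights = random_matrix(d_out, d_in, rng)     # layer expanding 4 -> 16 channels
inputs = random_matrix(d_in, n_samples, rng)  # one column per input sample
features = matmul(weights, inputs)            # shape: d_out x n_samples

feature_rank = matrix_rank(features)
print(f"output channels: {d_out}, feature rank: {feature_rank}")
```

The output has 16 channels, yet its rank cannot exceed the 4 input channels. This is the kind of limit the abstract refers to as a representational bottleneck.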

ReXNets vs. EfficientNets

Accuracy vs computational costs

Actual performance scores

CPU latencies are measured on a Xeon E5-2630_v4 with a single image, and GPU latencies are measured on M40 GPUs with a batch size of 64.

EfficientNets' scores are taken from arXiv v3 of the paper.

Model	Input Res.	Top-1 acc.	Top-5 acc.	FLOPs/params.	CPU Lat./ GPU Lat.
EfficientNet-B0	224x224	77.3	93.5	0.39B/5.3M	47ms/71ms
ReXNet_1.0	224x224	77.9	93.9	0.40B/4.8M	47ms/68ms

EfficientNet-B1	240x240	79.2	94.5	0.70B/7.8M	70ms/112ms
ReXNet_1.3	224x224	79.5	94.7	0.66B/7.6M	55ms/84ms

EfficientNet-B2	260x260	80.3	95.0	1.0B/9.2M	77ms/141ms
ReXNet_1.5	224x224	80.3	95.2	0.88B/9.7M	59ms/92ms

EfficientNet-B3	300x300	81.7	95.6	1.8B/12M	100ms/223ms
ReXNet_2.0	224x224	81.6	95.7	1.5B/16M	69ms/118ms
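To make the pairs in the table easier to compare, the following sketch computes the top-1 difference and the latency speedup of each ReXNet over the EfficientNet listed directly above it. The accuracy and latency values are copied from the table; the pairing and the helper name `compare` are only for illustration.

```python
# (name, top-1 %, CPU latency ms, GPU latency ms), copied from the table above.
PAIRS = [
    (("EfficientNet-B0", 77.3, 47, 71), ("ReXNet_1.0", 77.9, 47, 68)),
    (("EfficientNet-B1", 79.2, 70, 112), ("ReXNet_1.3", 79.5, 55, 84)),
    (("EfficientNet-B2", 80.3, 77, 141), ("ReXNet_1.5", 80.3, 59, 92)),
    (("EfficientNet-B3", 81.7, 100, 223), ("ReXNet_2.0", 81.6, 69, 118)),
]


def compare(baseline, candidate):
    """Return (top-1 difference, CPU speedup, GPU speedup) of candidate vs baseline."""
    _, base_acc, base_cpu, base_gpu = baseline
    _, cand_acc, cand_cpu, cand_gpu = candidate
    return (round(cand_acc - base_acc, 1), base_cpu / cand_cpu, base_gpu / cand_gpu)


for baseline, candidate in PAIRS:
    acc_diff, cpu_speedup, gpu_speedup = compare(baseline, candidate)
    print(f"{candidate[0]} vs {baseline[0]}: "
          f"top-1 {acc_diff:+.1f}, CPU x{cpu_speedup:.2f}, GPU x{gpu_speedup:.2f}")
```

A speedup above 1 means the ReXNet is faster than its paired EfficientNet on that device.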

Model performances

ImageNet classification results

Please refer to the following pretrained models. Top-1 and top-5 accuracies are reported together with the computational costs.
Note that all the models are trained and evaluated at an image size of 224x224.

Model	Input Res.	Top-1 acc.	Top-5 acc.	FLOPs/params
ReXNet_1.0	224x224	77.9	93.9	0.40B/4.8M
ReXNet_1.3	224x224	79.5	94.7	0.66B/7.6M
ReXNet_1.5	224x224	80.3	95.2	0.88B/9.7M
ReXNet_2.0	224x224	81.6	95.7	1.5B/16M
ReXNet_3.0	224x224	82.8	96.2	3.4B/34M

Finetuning results

COCO Object detection

The following results are obtained by training Faster RCNN with FPN:

Backbone	Img. Size	B_AP (%)	B_AP_0.5 (%)	B_AP_0.75 (%)	Params.	FLOPs	Eval. set
FBNet-C-FPN	1200x800	35.1	57.4	37.2	21.4M	119.0B	val2017
EfficientNetB0-FPN	1200x800	38.0	60.1	40.4	21.0M	123.0B	val2017
ReXNet_0.9-FPN	1200x800	38.0	60.6	40.8	20.1M	123.0B	val2017
ReXNet_1.0-FPN	1200x800	38.5	60.6	41.5	20.7M	124.1B	val2017

ResNet50-FPN	1200x800	37.6	58.2	40.9	41.8M	202.2B	val2017
ResNeXt-101-FPN	1200x800	40.3	62.1	44.1	60.4M	272.4B	val2017
ReXNet_2.2-FPN	1200x800	41.5	64.0	44.9	33.0M	153.8B	val2017

COCO instance segmentation

The following results are obtained by training Mask RCNN with FPN. S_AP and B_AP denote segmentation AP and box AP, respectively:

Backbone	Img. Size	S_AP (%)	S_AP_0.5 (%)	S_AP_0.75 (%)	B_AP (%)	B_AP_0.5 (%)	B_AP_0.75 (%)	Params.	FLOPs	Eval. set
EfficientNetB0_FPN	1200x800	34.8	56.8	36.6	38.4	60.2	40.8	23.7M	123.0B	val2017
ReXNet_0.9-FPN	1200x800	35.2	57.4	37.1	38.7	60.8	41.6	22.8M	123.0B	val2017
ReXNet_1.0-FPN	1200x800	35.4	57.7	37.4	38.9	61.1	42.1	23.3M	124.1B	val2017

ResNet50-FPN	1200x800	34.6	55.9	36.8	38.5	59.0	41.6	44.2M	207B	val2017
ReXNet_2.2-FPN	1200x800	37.8	61.0	40.2	42.0	64.5	45.6	35.6M	153.8B	val2017

Transfer learning results

Using ImageNet-pretrained models to transfer on the fine-grained datasets:

ReXNet-lites vs. EfficientNet-lites

Actual performance scores

We compare ReXNet-lites with EfficientNet-lites.

Model	Input Res.	Top-1 acc.	Top-5 acc.	FLOPs/params	CPU Lat./ GPU Lat.
EfficientNet-lite0	224x224	75.1	-	0.41B/4.7M	30ms/49ms
ReXNet-lite_1.0	224x224	76.2	92.8	0.41B/4.7M	31ms/49ms

EfficientNet-lite1	240x240	76.7	-	0.63B/5.4M	44ms/73ms
ReXNet-lite_1.3	224x224	77.8	93.8	0.65B/6.8M	36ms/61ms

EfficientNet-lite2	260x260	77.6	-	0.90B/6.1M	48ms/93ms
ReXNet-lite_1.5	224x224	78.6	94.2	0.84B/8.3M	39ms/68ms

EfficientNet-lite3	280x280	79.8	-	1.4B/8.2M	60ms/131ms
ReXNet-lite_2.0	224x224	80.2	95.0	1.5B/13M	49ms/90ms

Getting Started

Requirements

Python3
PyTorch (> 1.0)
Torchvision (> 0.2)
NumPy
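A quick way to check an environment against these requirements is sketched below. It uses only the standard library (`importlib.metadata`, Python 3.8+); the helper names `parse_version` and `is_newer` are hypothetical, and `torch` and `torchvision` are assumed to be the installed distribution names.

```python
import re
from importlib import metadata

# Minimum versions from the list above; each bound is exclusive ("> x.y").
MINIMUMS = {"torch": "1.0", "torchvision": "0.2"}


def parse_version(text):
    """Turn '1.7.1+cu110' into (1, 7, 1); missing components count as 0."""
    numbers = [int(n) for n in re.findall(r"\d+", text.split("+")[0])[:3]]
    return tuple(numbers + [0] * (3 - len(numbers)))


def is_newer(installed, minimum):
    """True if `installed` is strictly greater than `minimum`."""
    return parse_version(installed) > parse_version(minimum)


for package, minimum in MINIMUMS.items():
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        print(f"{package}: not installed")
        continue
    status = "ok" if is_newer(installed, minimum) else f"needs > {minimum}"
    print(f"{package} {installed}: {status}")
```

NumPy is listed without a minimum version, so the sketch does not check it.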

Using the pretrained models

Usage is the same as for the other models officially released in PyTorch's Torchvision.
Using the models on GPUs:

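import torch
import rexnetv1

# Build ReXNet-1.0x and move it to the GPU.
model = rexnetv1.ReXNetV1(width_mult=1.0).cuda()
# Load the pretrained weights from the downloaded checkpoint.
model.load_state_dict(torch.load('./rexnetv1_1.0x.pth'))
model.eval()  # inference mode: disables dropout, uses BatchNorm running statistics
with torch.no_grad():  # gradients are not needed for inference
    print(model(torch.randn(1, 3, 224, 224).cuda()))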

For CPUs:

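import torch
import rexnetv1

# Build ReXNet-1.0x on the CPU.
model = rexnetv1.ReXNetV1(width_mult=1.0)
# map_location remaps tensors saved on a GPU so they load on a CPU-only machine.
model.load_state_dict(torch.load('./rexnetv1_1.0x.pth', map_location=torch.device('cpu')))
model.eval()  # inference mode: disables dropout, uses BatchNorm running statistics
with torch.no_grad():  # gradients are not needed for inference
    print(model(torch.randn(1, 3, 224, 224)))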
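The snippets above print the raw model output. Assuming that output is a vector of unnormalized class scores (logits), as with Torchvision classifiers, a minimal sketch for turning it into top-k predictions could look like the following. The helper `top_k` is hypothetical and works on a plain Python list, so a tensor output would first be converted, for example with `model(x)[0].tolist()`.

```python
import math


def top_k(logits, k=5):
    """Return the k most likely (class_index, probability) pairs from raw logits."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices from most to least probable.
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Stand-in logits for illustration; a real ImageNet model returns 1000 scores.
example_logits = [0.1, 2.5, -1.0, 0.7, 3.2, 0.0]
for class_index, prob in top_k(example_logits, k=3):
    print(f"class {class_index}: {prob:.3f}")
```

Mapping the class indices to human-readable labels requires an ImageNet label file, which is not covered here.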

Training your own ReXNet

ReXNet can be trained with any PyTorch training code, including ImageNet training in PyTorch, given the model file and proper arguments. Since the provided model file is not complicated, it can simply be converted to train a ReXNet in other frameworks such as MXNet. For MXNet, we recommend MXNet-gluoncv as the training code.

Using PyTorch, we trained ReXNets with one of the popular ImageNet classification codebases, rwightman's pytorch-image-models, for more efficient training. After adding ReXNet's model file to the training code, one can train ReXNet-1.0x with the following command line:

./distributed_train.sh 4 /imagenet/ --model rexnetv1 --rex-width-mult 1.0 --opt sgd --amp \
 --lr 0.5 --weight-decay 1e-5 \
 --batch-size 128 --epochs 400 --sched cosine \
 --remode pixel --reprob 0.2 --drop 0.2 --aa rand-m9-mstd0.5
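To get a feel for what these arguments imply, the sketch below computes the global batch size and a plain cosine learning-rate curve from the values in the command. It rests on two assumptions: that `--batch-size` is applied per process, so that 4 processes give a global batch of 4 x 128, and that the schedule is an unmodified cosine decay to zero. The actual scheduler in the training code may add warmup or a minimum learning rate, which is not modeled here; `cosine_lr` is a hypothetical helper.

```python
import math

NUM_GPUS = 4          # first argument to distributed_train.sh
PER_GPU_BATCH = 128   # --batch-size (assumed to be per process)
BASE_LR = 0.5         # --lr
EPOCHS = 400          # --epochs


def cosine_lr(epoch, base_lr=BASE_LR, total_epochs=EPOCHS, min_lr=0.0):
    """Plain cosine annealing from base_lr down to min_lr over total_epochs."""
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


global_batch = NUM_GPUS * PER_GPU_BATCH
print(f"global batch size: {global_batch}")
for epoch in (0, 100, 200, 300, 400):
    print(f"epoch {epoch:3d}: lr = {cosine_lr(epoch):.4f}")
```

When training on a different number of GPUs, the learning rate is often rescaled with the global batch size; that rescaling is a common heuristic rather than something this README prescribes.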

License

This project is distributed under the MIT license.

How to cite

@article{han2020rexnet,
    title={{ReXNet}: Diminishing Representational Bottleneck on Convolutional Neural Network},
    author={Han, Dongyoon and Yun, Sangdoo and Heo, Byeongho and Yoo, YoungJoon},
    year={2020},
    journal={arXiv preprint arXiv:2007.00992},
}
