xuxu116/pytorch-reid-lite

Overview

This codebase is for training/deploying models in pytorch (onnx), currently it provides basic protocols for model training, evaluation and deploying.

Features

Distillation. Rank1:0.942993/map:0.831319 taught by resnet-101 model
Spectral feature transform (rank1: 0.945071, map: 0.827155 w\o post-processing). (https://arxiv.org/abs/1811.11405).
PCB structure (https://arxiv.org/abs/1711.09349); Improved training strategy
GAN related person generator(unstable)
AM softmax & triplet loss
step-wise LR warm-up

Install Dependency

This code depends on pytorch v0.4 and torchvision, run the following command to install pytorch:

pip install --user torch==0.4 torchvision==0.2.1 tensorflow==1.8 tensorboardX lmdb -i https://pypi.douban.com/simple/

Model Training

To train a model, clone the repo, modify params.json as you need, and run train.py.

cd pytorch-reid-lite
# Modify params.json - specify your own working dir.
# sub_working_dir is optional
python train.py --operation start_train --config_path params.json --sub_working_dir SUB\_WORKING\_DIR\_NAME

On-the-Fly Evaluation:

You can enable on-the-fly automatic evaluation by setting "type" under "evaluation_params" key in params.json (default is "None"). If set, after each epoch the code will run your evaluation, and only saves the best-performing model.

The code currently supports "market_evaluate" for person-reid and "classification_evaluate" for image classification, but it is easy to extend this to support other evalutiaons (like LFW). All you need to do is create a new file - say "lfw_evaluate.py" in the evaluate folder, and expose a run_eval method which takes in your training config and returns your evaluation result. See evaluate/market_evalute.py for an example.

Offline evaluation

python evaluator.py eval_params.json

Tensorboard visualization

You can visualize your training progress with tensorboardX (a pytorch integration of Tensorboard for Tensorflow), the code generates an event file in your sub working dir, to run tensorboard, do so as you would when using Tensorflow:

cd ~/.local/bin
./tensorboard --logdir=YOUR_SUB_WORKING_DIR --port=YOUR_PORT

Tips summary

Which benefits:

PCB structure
PCB randomly update
batchnorm
random erasing, zero paddding crop
warm-up learning rate
global branch
small batchsize

Which might helps:

feature erasing
feature mask
tri-loss
balanced sampling
multi-gpu training (differs in BN layer)

Not working:

adam
am-softmax
bias in FC layer or BN

Baselines

backbone	imgSize	PCB	rank1	map	aug.	batchsize	comments
resnet-50	384*128	1536/6	0.628266	0.346756	mirro	64*1	classifier no bias, 60 epoch, decay per 40
resnet-50	384*128	1536/6	0.683492	0.411627	mirro	64*1	weight_decay from 4e-5 to 5e-4
resnet-50	384*128	1536/6	0.837886	0.620621	mirro	64*1	add dropout before PCB
resnet-50	384*128	1536/6	0.856888	0.640600	mirro	64*1	last_conv_stride=1
resnet-50	384*128	1536/6	0.920724	0.755717	mirro	64*1	add BN to pcb stripe
resnet-50	384*128	1536/6	0.921318	0.765050	mirro,RE	64*1	add BN to pcb stripe
resnet-50	384*128	1536/6	0.927553	0.776928	mirro,RE	64*1	add global branch
resnet-50	384*128	1536/6	0.926366	0.784323	mirro,RE	64*1	random erase 1 branch, wp
resnet-50	384*128	1536/6	0.928147	0.785333	mirro,RE	64*1	random erase 5 branch, wp
resnet-50	384*128	1536/6	0.929929	0.790466	mirro,RE	64*1	random erase 6 branch, wp
resnet-50	384*128	1536/6	0.929038	0.787618	mirro,RE	64*1	random erase 6 branch, wp, 32X2
resnet-50	384*128	1536/6	0.927850	0.782085	mirro,RE	64*1	random erase 6 branch, wp, 16X4
resnet-50	384*128	1536/6	0.928741	0.771841	mirro,RE	64*1	global branch m=0.1
resnet-50	384*128	1536/6	0.926960	0.777564	mirro,RE	64*1	global branch m=0.3, warm-up
resnet-50	384*128	1536/6	0.926069	0.764451	mirro,RE	64*1	global branch m=0.4, warm-up
resnet-50	384*128	1536/6	0.924287	0.777912	mirro,RE	64*1	global branch m=0.4, warm-up, mask
resnet-50	384*128	1536/6	0.920428	0.775502	mirro,RE	64*1	mask@global branch
resnet-50	384*128	1536/6	0.930523	0.783172	mirro,RE	64*1	change hue
resnet-50	384*128	1536/6	0.920724	0.768056	mirro	32*1	120 epoch, decay per 40, hue
resnet-50	256*128	1024/4	0.907957	0.731270	mirro	32*1	120 epoch, decay per 40
resnet-50	256*128	1024/4	0.907957	0.750186	mirro,RE	32*1	120 epoch, decay per 40

For following settings

PCB branchs = 6
batch_size = 64
image size h x w = 384 x 128

GPU memory usage:

9529MiB for last_conv_stride=1 (130 example/sec)
7155MiB for last_conv_stride=2 (170 example/sec)

add global branchs at resnet-stage-4(start from no relu and dropout, adaptiveMaxPool)

backbone	imgSize	PCB	rank1	map	aug.	bs	comments
resnet-50	384*128	1536+256	0.935273	0.802506	mirro,RE	64*1	no relu & dropout, global f erasing(RE)
resnet-50	384*128	1536+256	0.940321	0.818069	mirro,RE	64*1	padcrop_10
resnet-50	384*128	1536+256	0.935570	0.820962	mirro,RE	64*1	random erase 6 branch(RB)
resnet-50	384*128	1536+256	0.935570	0.821505	mirro,RE	64*1	dropout, without feature erasing
resnet-50	384*128	1536+256	0.937055	0.818202	mirro,RE	64*1	dropout, no_pcbRE, no f_RE
resnet-50	384*128	1536+256	0.937945	0.815731	mirro,RE	64*1	pcbFE0.3, no_pcbRE, no f_RE
resnet-50	384*128	1536+256	0.927257	0.793121	mirro,RE	64*1	mask@ all bracnchs, pcbRE
resnet-50	384*128	1536+256	0.940024	0.818851	mirro,RE	64*1	no f_RE, update max loss branch
resnet-50	384*128	1536+256	0.925178	0.807808	mirro,RE	32*2	pcb_s_triloss, no_mask, no_pcbRE
resnet-50	384*128	1536+256	0.932304	0.819861	mirro,RE	32*2	pcb_s_triloss m=0.16
resnet-50	384*128	1536+256	0.940618	0.826704	mirro,RE	32*2	g_triloss + pcb_g_triloss(soft)
resnet-50	384*128	1536+256	0.940618	0.831889	mirro,RE	32*2	g_tri + pcb_g_tri, m=0.16
resnet-50	384*128	1536+256	0.941211	0.835557	mirro,RE	32*2	g_tri + pcb_g_tri, pcbRB6
resnet-50	384*128	1536+256	0.943290	0.834388	mirro,RE	32*2	g_tri_0.16, pcbRB6
resnet-50	384*128	1536+256	0.939133	0.826529	mirro,RE	32*2	g_tri_0.16, pcbRB6+mask
resnet-50	384*128	1536+256	0.883314	0.739106	mirro,RE	32*2	g_tri_0.16, pcbRB6+am0.3s15
resnet-50	384*128	1536+256	0.917458	0.791989	mirro,RE	32*2	g_tri_0.16, pcbRB6+am0.3s0
resnet-50	384*128	1536+256	0.940024	0.828990	mirro,RE	32*2	g_tri_0.16, pcbRB6, no additional stage-4
resnet-50	384*128	1536+256	0.939727	0.830806	mirro,RE	48*3	g_tri_0.16, pcbRB6
resnet-50	384*128	1536+256	0.939133	0.829087	mirro,RE	32*2	g_tri_0.16, pcbRB6, BN_nobias

Conclusions:

global branch after stage-4 helps
AM-softmax still cause overfitting
Tri-loss only used in global features
Update each PCB branches randomly

backbone	imgSize	PCB	rank1	map	aug.	batchsize	comments
resnet-50	256*128	256*1	0.802553	0.601922	mirro	128*1	last_stride=1
resnet-50	256*128	256*1	0.869062	0.685709	mirro	128*1	add BN, Dropout after feature layer
resnet-50	256*128	256*1	0.867874	0.685979	mirro	128*1	cls no bias (not use)
resnet-50	256*128	256*1	0.893112	0.740011	mirro	32*1	add BN, Dropout after feature layer
resnet-50	256*128	256*1	0.898753	0.749818	mirro,RE	32*1	120 epoch, decay per 40
resnet-50	256*128	256*1	0.907660	0.763313	mirro,RE	32*1	warm-up before 20 epoch
resnet-50	256*128	256*1	0.923100	0.782874	mirro,RE	8*4	700+ epochs
resnet-50	256*128	256*1	0.931116	0.819774	mirro,RE	8*4	pad_zero_crop, no dropout
resnet-50	256*128	256*1	0.945071	0.827155	mirro,RE	8*4	st0.3, fine-tuned
resnet-50	256*128	256*1	0.948931	0.873448	mirro,RE	8*4	st0.3, post-proce 0.5/top50
resnet-50	256*128	256*1	0.900831	0.774981	mirro,RE	16*8	spectral st_0.5_norm, pad_6
resnet-50	256*128	256*1	0.922506	0.811486	mirro,RE	8*4	tri_m=0.16, pad_6
resnet-50	256*128	256*1	0.921912	0.801184	mirro,RE	16*2	tri_m=0.16, pad_6
resnet-50	256*128	256*1	0.905879	0.756945	mirro,RE	32*1	am=0.0
resnet-50	256*128	256*1	0.898753	0.756945	mirro,RE	32*1	am=0.0(w normalized)
resnet-50	256*128	256*1	0.895190	0.756697	mirro,RE	32*1	am=0.1
resnet-50	256*128	256*1	0.906473	0.774181	mirro,RE	32*1	Add feature mask
resnet-50	256*128	256*1	0.914786	0.788952	mirro,RE	32*1	Change hue(with mask)
resnet-50	256*128	256*1	0.896081	0.738212	mirro,RE	32*1	Crop 288*144
resnet-50	256*128	256*1	0.849169	0.673918	mirro	32*1	adam, epoch 20 lr decay
resnet-50	256*128	256*1	0.864014	0.679649	mirro	32*1	adam, epoch 40 lr decay
resnet-50	256*128	256*1	0.867874	0.704566	mirro	32*1	global_pool 2048d as feature