TorchCV: a PyTorch vision library that mimics ChainerCV

Detection

Model	Original Paper	ChainerCV	TorchCV*
SSD300@voc07_test	74.3%	77.8%	76.68%
SSD512@voc07_test	76.8%	79.2%	78.89%
FPNSSD512@voc07_test	-	-	81.46%

The TorchCV SSD models score about 0.3 to 1.1 points lower than their ChainerCV counterparts. This is because the VGG base model I use performs slightly worse.
When I replaced the pytorch/vision VGG16 model with the one used in ChainerCV, the SSD512 model reached 79.85% accuracy.
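
The gaps discussed above can be checked with a few lines of arithmetic. This is only a sketch: the numbers are copied from the table and the 79.85% figure, while the names `results`, `gap`, and `vgg_swap_gain` are made up for illustration and are not part of TorchCV.

```python
# Accuracy (%) on the VOC07 test set, copied from the table above.
results = {
    "SSD300": {"chainercv": 77.8, "torchcv": 76.68},
    "SSD512": {"chainercv": 79.2, "torchcv": 78.89},
}

# SSD512 result reported after swapping in the VGG16 model used by ChainerCV.
ssd512_with_chainercv_vgg = 79.85


def gap(model):
    """ChainerCV score minus TorchCV score, in percentage points."""
    scores = results[model]
    return round(scores["chainercv"] - scores["torchcv"], 2)


# Per-model gap between the two libraries.
gaps = {model: gap(model) for model in results}

# Gain on SSD512 from changing only the VGG16 base model.
vgg_swap_gain = round(ssd512_with_chainercv_vgg - results["SSD512"]["torchcv"], 2)

print(gaps, vgg_swap_gain)
```

The gain from swapping the base model is larger than the SSD512 gap itself, which is consistent with the base model being the main source of the difference.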

FPNSSD512 is created by replacing the SSD VGG16 network with FPN50; the rest is the same. It beats all the SSD models.
You can download the trained params here.
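
To illustrate what "replacing the backbone while keeping the rest the same" can look like in PyTorch, here is a minimal sketch. It is not TorchCV's actual code: `TinyBackbone` and `Detector` are hypothetical stand-ins, the tiny conv stack only plays the role of a real extractor such as VGG16 or FPN50, and the anchor and class counts (4 anchors per location, 21 classes for the 20 VOC categories plus background) are assumptions.

```python
import torch
from torch import nn


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for a feature extractor such as VGG16 or FPN50."""

    def __init__(self, out_channels, num_maps):
        super().__init__()
        self.out_channels = out_channels
        # One stride-2 conv per feature map: each map is half the size of the previous one.
        self.stages = nn.ModuleList([
            nn.Conv2d(3 if i == 0 else out_channels, out_channels, 3, stride=2, padding=1)
            for i in range(num_maps)
        ])

    def forward(self, x):
        feature_maps = []
        for stage in self.stages:
            x = torch.relu(stage(x))
            feature_maps.append(x)
        return feature_maps


class Detector(nn.Module):
    """SSD-style heads on top of any backbone that returns a list of feature maps."""

    def __init__(self, backbone, num_maps, num_anchors=4, num_classes=21):
        super().__init__()
        self.backbone = backbone
        self.num_classes = num_classes
        channels = backbone.out_channels
        # One box-regression head and one classification head per feature map.
        self.loc_heads = nn.ModuleList([
            nn.Conv2d(channels, num_anchors * 4, 3, padding=1) for _ in range(num_maps)
        ])
        self.cls_heads = nn.ModuleList([
            nn.Conv2d(channels, num_anchors * num_classes, 3, padding=1) for _ in range(num_maps)
        ])

    def forward(self, x):
        locs, scores = [], []
        for fm, loc_head, cls_head in zip(self.backbone(x), self.loc_heads, self.cls_heads):
            n = fm.size(0)
            # Move channels last, then flatten to (batch, anchors, 4 or num_classes).
            locs.append(loc_head(fm).permute(0, 2, 3, 1).reshape(n, -1, 4))
            scores.append(cls_head(fm).permute(0, 2, 3, 1).reshape(n, -1, self.num_classes))
        return torch.cat(locs, dim=1), torch.cat(scores, dim=1)


# Two detectors that differ only in the backbone passed to the constructor.
small = Detector(TinyBackbone(out_channels=16, num_maps=3), num_maps=3)
wide = Detector(TinyBackbone(out_channels=64, num_maps=3), num_maps=3)

images = torch.randn(1, 3, 64, 64)
loc_small, cls_small = small(images)
loc_wide, cls_wide = wide(images)
```

Because the heads only depend on `backbone.out_channels` and the number of feature maps, both detectors produce outputs of the same shape; only the feature extractor changes.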

Update

[2018-2-6] Our FPNSSD512 model achieved 1st place on the PASCAL VOC 2012 dataset.

Check the leaderboard.

[2018-2-26] As issue #11 pointed out, I shouldn't have used VOC07 data for training. I submitted another result that is trained only on VOC12 data. The older submission has been marked private.

[2018-3-29] After Alibaba Turing Lab submitted a result of 74.8% mAP, which took first place on Comp3, I decided to train a deeper model (replacing FPN50 with FPN152, trained only on VOC12 data).
It got an mAP of 77%, which is far higher than I expected.
Check the new leaderboard. The older submission has been marked private.

TODO


MIT License
