This is an unofficial PyTorch re-implementation of the paper "Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network" (ICCV 2019), written for PyTorch >= 1.4.0.
- Backbone model
- FPEM model
- FFM model
- Integrated model
- Loss Function
- Data preprocessing
- Data postprocessing
- Training pipeline
- Inference pipeline
- Evaluation pipeline
python train.py --batch 32 --epoch 5000 --dataset_type ctw --gpu True
python inference.py --input ./data/CTW1500/test/text_image --model ./outputs/model_epoch_0.pth --bbox_type poly
Model | Precision | Recall | F score | FPS (CPU) + pa.py | FPS (1 GPU) + pa.py | FPS (1 GPU) + pa.pyx |
---|---|---|---|---|---|---|
PAN-640 | 0.8509 | 0.7927 | 0.8208 | 0.3493 | 4.6347 | 21.167 |
Model | Precision | Recall | F score | FPS (CPU) + pa.py | FPS (1 GPU) + pa.py | FPS (1 GPU) + pa.pyx |
---|---|---|---|---|---|---|
PAN-640 | 0.9011 | 0.8040 | 0.8498 | 0.2883 | 7.6481 | 20.390 |
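
The F score values in the tables above look consistent with the usual harmonic mean of precision and recall, F = 2PR / (P + R). The snippet below is a minimal sketch for checking that. The helper name `f_score` is hypothetical and not part of this repository, and the assumption that the evaluation pipeline uses this exact formula is not confirmed here.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of 'F score')."""
    if precision + recall == 0:
        # Avoid division by zero when nothing was detected or matched.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs copied from the two PAN-640 rows above.
print(round(f_score(0.8509, 0.7927), 4))  # 0.8208
print(round(f_score(0.9011, 0.8040), 4))  # 0.8498
```

Both results match the reported F scores to four decimal places.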
- CTW1500: https://github.com/Yuliang-Liu/Curve-Text-Detector
- Total-Text: https://github.com/cs-chan/Total-Text-Dataset
- SynthText: https://www.robots.ox.ac.uk/~vgg/data/scenetext/
- MSRA-TD500: http://www.iapr-tc11.org/mediawiki/index.php/MSRA_Text_Detection_500_Database_(MSRA-TD500)
- ICDAR-2015: https://rrc.cvc.uab.es/
[1] Original paper: https://arxiv.org/abs/1908.05900
[2] Official PyTorch code: https://github.com/whai362/pan_pp.pytorch