Show, Attend and Read - A PyTorch Implementation
A PyTorch (>= 1.4.0) implementation of "Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition" (AAAI 2019).
Task
- Backbone model
- Encoder model
- Decoder model
- Integrated model
- Data processing
- Training pipeline
- Inference pipeline
Supported Datasets
- Street View Text: http://vision.ucsd.edu/~kai/svt/
- IIIT5K: https://cvit.iiit.ac.in/research/projects/cvit-projects/the-iiit-5k-word-dataset
- Syn90k: https://www.robots.ox.ac.uk/~vgg/data/text/
- SynthText: https://www.robots.ox.ac.uk/~vgg/data/scenetext/
Command
Training
python train.py --batch 32 --epoch 5000 --dataset ./svt --dataset_type svt --gpu True
Inference
python inference.py --batch 32 --input input_folder --model model_path --gpu True
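Both commands above pass `--gpu True` as an explicit value. With `argparse`, declaring such a flag with `type=bool` would treat any non-empty string (including `False`) as true, so a small string-to-bool converter is one common way to handle it. The sketch below is a hypothetical parser for the training flags shown above, not the actual contents of `train.py`; the helper names `str2bool` and `build_train_parser` and all default values are assumptions for illustration.

```python
import argparse


def str2bool(value: str) -> bool:
    """Convert strings such as 'True' or 'false' into a real bool (hypothetical helper)."""
    text = value.strip().lower()
    if text in ("true", "t", "yes", "y", "1"):
        return True
    if text in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_train_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags used in the training command (assumed layout)."""
    parser = argparse.ArgumentParser(description="Sketch of train.py-style flags")
    # Flag names come from the command above; defaults are illustrative assumptions.
    parser.add_argument("--batch", type=int, default=32, help="batch size")
    parser.add_argument("--epoch", type=int, default=5000, help="number of epochs")
    parser.add_argument("--dataset", type=str, required=True, help="dataset folder")
    parser.add_argument("--dataset_type", type=str, required=True, help="e.g. svt")
    # str2bool makes '--gpu False' actually evaluate to False.
    parser.add_argument("--gpu", type=str2bool, default=False, help="use the GPU")
    return parser


if __name__ == "__main__":
    args = build_train_parser().parse_args(
        ["--batch", "32", "--epoch", "5000",
         "--dataset", "./svt", "--dataset_type", "svt", "--gpu", "True"]
    )
    print(args)
```

With a converter like this, `--gpu False` disables the GPU as a reader would expect, instead of silently enabling it.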
Results
SVT
IIIT5K
Figures (not shown): input image and output attention map per character.
Syn90k (10k for training, 3k for testing)
Figures (not shown): input image and output attention map per character.
SynthText (80k for training, 20k for testing)
Figures (not shown): input image and output attention map per character.
Source
[1] Original paper: https://arxiv.org/abs/1811.00751
[2] Official Torch code by the authors: https://github.com/wangpengnorman/SAR-Strong-Baseline-for-Text-Recognition
[3] A TensorFlow implementation: https://github.com/Pay20Y/SAR_TF