mengxiang / MaskTextSpotterV3

The code of "Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting"

Mask TextSpotter v3

This is a PyTorch implementation of the ECCV 2020 paper Mask TextSpotter v3. Mask TextSpotter v3 is an end-to-end trainable scene text spotter that adopts a Segmentation Proposal Network (SPN) instead of a Region Proposal Network (RPN), which significantly improves robustness to rotations, aspect ratios, and shapes.

Relationship to Mask TextSpotter

Here we label the Mask TextSpotter series as Mask TextSpotter v1 (ECCV 2018 paper, code), Mask TextSpotter v2 (TPAMI paper, code), and Mask TextSpotter v3 (ECCV 2020 paper).

This project is licensed under Creative Commons Attribution-NonCommercial 4.0 International. Part of the code is inherited from Mask TextSpotter v2, which is under an MIT license.



Installation

Requirements:

  • Python3 (Python3.7 is recommended)
  • PyTorch >= 1.4 (1.4 is recommended)
  • cocoapi
  • yacs
  • matplotlib
  • GCC >= 4.9 (This is very important!)
  • OpenCV
  • CUDA >= 9.0 (10.0.130 is recommended)

  # first, make sure that your conda is set up properly with the right environment
  # for that, check that `which conda`, `which pip` and `which python` point to the
  # right path. From a clean conda env, this is what you need to do

  conda create --name masktextspotter -y
  conda activate masktextspotter

  # this installs the right pip and dependencies for the fresh python
  conda install ipython pip

  # python dependencies
  pip install ninja yacs cython matplotlib tqdm opencv-python shapely scipy tensorboardX pyclipper Polygon3 editdistance 

  # install PyTorch
  conda install pytorch torchvision cudatoolkit=10.0 -c pytorch


  # install pycocotools
  git clone
  cd cocoapi/PythonAPI
  python build_ext install

  # install apex
  git clone
  cd apex
  python install --cuda_ext --cpp_ext

  # clone repo
  git clone
  cd MaskTextSpotterV3

  # build
  python build develop
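
After the build, a quick sanity check of the environment against the requirements listed above can save debugging time. The following is a minimal sketch and not part of the repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the PyTorch checks only run if `torch` is importable.

```python
import importlib.util
import re
import sys


def parse_version(text):
    """Turn a version string such as '1.4.0+cu100' into a tuple of ints, e.g. (1, 4, 0)."""
    numbers = re.findall(r"\d+", text.split("+")[0])
    return tuple(int(n) for n in numbers)


def meets_minimum(found, minimum):
    """Return True if version string `found` is at least `minimum`."""
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.4' and '1.4.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment():
    """Collect human-readable problems; an empty list means the checks passed."""
    problems = []
    if sys.version_info < (3,):
        problems.append("Python 3 is required")
    if importlib.util.find_spec("torch") is None:
        problems.append("PyTorch is not installed")
    else:
        import torch
        if not meets_minimum(torch.__version__, "1.4"):
            problems.append("PyTorch >= 1.4 is required, found " + torch.__version__)
        if not torch.cuda.is_available():
            problems.append("CUDA is not available to PyTorch")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the script inside the activated conda environment prints one warning per failed check and nothing if everything passes. It does not verify the GCC version, which still has to be checked separately.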



Models

Download the trained model: Google Drive, BaiduYun (download code: cnj2).

Optional: download the model pretrained on SynthText for a quick re-implementation: Google Drive, BaiduYun (download code: c82l).


Demo

Run the demo script for single-image inference with python tools/


Datasets

The datasets are the same as in Mask TextSpotter v2.

Download ICDAR2013 (Google Drive, BaiduYun) and ICDAR2015 (Google Drive, BaiduYun) as examples.

The SCUT dataset used for training can be downloaded here.

The converted labels of the Total-Text dataset can be downloaded here.

The converted labels of SynthText can be downloaded here.

The root of the dataset directory should be MaskTextSpotterV3/datasets/.


Testing

Prepare dataset

An example of the path of test images: MaskTextSpotterV3/datasets/icdar2015/test_images
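
Before running a test, it can help to confirm that the expected folders exist under the dataset root. The snippet below is a minimal sketch assuming the layout described above; the function name `missing_dataset_dirs` is hypothetical, and the list of sub-directories should be adapted to the datasets you actually use.

```python
from pathlib import Path


def missing_dataset_dirs(repo_root, expected_subdirs):
    """Return the expected sub-directories that do not exist under <repo_root>/datasets/."""
    dataset_root = Path(repo_root) / "datasets"
    return [sub for sub in expected_subdirs if not (dataset_root / sub).is_dir()]


if __name__ == "__main__":
    # "icdar2015/test_images" is assumed from the example path; adjust it to your layout.
    missing = missing_dataset_dirs("MaskTextSpotterV3", ["icdar2015/test_images"])
    if missing:
        print("Missing under datasets/:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```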

Check the following parameters in the config file (configs/finetune.yaml):

  • test dataset: DATASETS.TEST
  • input size: INPUT.MIN_SIZE_TEST
  • model path: MODEL.WEIGHT
  • output directory: OUTPUT_DIR
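
The dotted parameter names refer to nested keys in the YAML config: `INPUT.MIN_SIZE_TEST`, for instance, is the `MIN_SIZE_TEST` entry inside the `INPUT` section, while `OUTPUT_DIR` sits at the top level. The sketch below illustrates that mapping with a plain dictionary. The values are placeholders rather than the repository's defaults, and `get_option` and `set_option` are hypothetical helpers, not part of yacs.

```python
def get_option(config, dotted_key):
    """Look up a nested value using a dotted name such as 'INPUT.MIN_SIZE_TEST'."""
    node = config
    for part in dotted_key.split("."):
        node = node[part]
    return node


def set_option(config, dotted_key, value):
    """Set a nested value, creating intermediate sections if they are missing."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value


if __name__ == "__main__":
    # Placeholder values for illustration only.
    config = {
        "INPUT": {"MIN_SIZE_TEST": 1000},
        "MODEL": {"WEIGHT": "path/to/model.pth"},
        "OUTPUT_DIR": "./outputs/placeholder",
    }
    set_option(config, "INPUT.MIN_SIZE_TEST", 1440)
    print(get_option(config, "INPUT.MIN_SIZE_TEST"))
    print(get_option(config, "OUTPUT_DIR"))
```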

run sh


Training

Place all the training sets in MaskTextSpotterV3/datasets/ and check DATASETS.TRAIN in the config file.


Trained with SynthText

python3 -m torch.distributed.launch --nproc_per_node=8 tools/ --config-file configs/pretrain/seg_rec_poly_fuse_feature.yaml


Trained with a mixture of SynthText, icdar2013, icdar2015, scut-eng-char, and total-text

Check the initial weights in the config file.

python3 -m torch.distributed.launch --nproc_per_node=8 tools/ --config-file configs/mixtrain/seg_rec_poly_fuse_feature.yaml
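
For context on the commands above: `torch.distributed.launch` starts one process per GPU (`--nproc_per_node=8` means eight), passes each process a `--local_rank` argument, and exports `WORLD_SIZE` in the environment. The following is a minimal sketch of how a launch-compatible entry point might read these. It is an illustration under those assumptions, not the repository's training script, and `parse_launch_args` and `is_distributed` are hypothetical names.

```python
import argparse
import os


def parse_launch_args(argv=None):
    """Parse the arguments a launch-compatible script typically receives."""
    parser = argparse.ArgumentParser(description="Sketch of a launch-compatible entry point")
    parser.add_argument("--config-file", default="", metavar="FILE",
                        help="path to the YAML config")
    # torch.distributed.launch appends --local_rank=<index> for every process it spawns.
    parser.add_argument("--local_rank", type=int, default=0)
    return parser.parse_args(argv)


def is_distributed():
    """The launcher exports WORLD_SIZE; a value above 1 means several processes are running."""
    return int(os.environ.get("WORLD_SIZE", "1")) > 1


if __name__ == "__main__":
    args = parse_launch_args()
    print("config:", args.config_file,
          "| local rank:", args.local_rank,
          "| distributed:", is_distributed())
```

Each process would normally use its local rank to select a GPU, so that the eight processes do not compete for the same device.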


Evaluation

Download lexicons

Google Drive, Baidu Drive (download code: f3tk)

Unzip the lexicons and place them under evaluation/lexicons/.
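
Lexicons are commonly used in scene text spotting to snap a raw recognition result to the nearest word in a given vocabulary. As a rough illustration of that idea only, here is a plain Levenshtein sketch. The repository's own evaluation scripts may match words differently, and `edit_distance` and `closest_lexicon_word` are hypothetical names.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


def closest_lexicon_word(prediction, lexicon):
    """Return the lexicon entry with the smallest case-insensitive edit distance."""
    return min(lexicon, key=lambda word: edit_distance(prediction.lower(), word.lower()))


if __name__ == "__main__":
    # Made-up lexicon and prediction, for illustration only.
    print(closest_lexicon_word("STARBUKS", ["Starbucks", "Subway", "Stop"]))
```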

Evaluation for Total-Text dataset

cd evaluation/totaltext/e2e/
# edit "result_dir" in

Evaluation for the Rotated ICDAR 2013 dataset

First, generate the Rotated ICDAR 2013 dataset

cd tools
# set the specific rotating angle in

Then run testing (change the test set in the YAML config) and evaluate with evaluation/rotated_icdar2013/e2e/
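
Generating a rotated test set means rotating both each image and its annotated polygons by the same angle. The snippet below sketches only the annotation half of that step. It is an illustration rather than the repository's tool: `rotate_points` is a hypothetical helper, and the image itself would be rotated separately, for example with OpenCV.

```python
import math


def rotate_points(points, angle_degrees, center):
    """Rotate (x, y) points about `center` by `angle_degrees`.

    Uses the standard rotation matrix. In image coordinates, where y grows
    downward, a positive angle therefore appears clockwise on screen.
    """
    theta = math.radians(angle_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * cos_t - dy * sin_t,
                        cy + dx * sin_t + dy * cos_t))
    return rotated


if __name__ == "__main__":
    # Made-up text box in a 200x100 image, rotated by 30 degrees about the image center.
    box = [(50, 40), (150, 40), (150, 60), (50, 60)]
    for x, y in rotate_points(box, 30, center=(100, 50)):
        print(round(x, 1), round(y, 1))
```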

Citing the related works

Please cite the related works in your publications if it helps your research:

  • Minghui Liao, Guan Pang, Jing Huang, Tal Hassner, and Xiang Bai. "Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting." Proceedings of the European Conference on Computer Vision (ECCV), 2020.
  • M. Liao, P. Lyu, M. He, C. Yao, W. Wu, and X. Bai. "Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes." IEEE Transactions on Pattern Analysis and Machine Intelligence.
  • Pengyuan Lyu, Minghui Liao, Cong Yao, Wenhao Wu, and Xiang Bai. "Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes." Proceedings of the European Conference on Computer Vision (ECCV), 2018.
