hanquansanren / Consistency_Regularization_STR

It's the code for the paper Pushing the Performance Limit of Scene Text Recognizer without Human Annotation, CVPR 2022.

Consistency_Regularization_STR

This repository contains the code for the paper "Pushing the Performance Limit of Scene Text Recognizer without Human Annotation" (CVPR 2022). It has been tested with Python 3.7.

Install the environment

    pip install -r requirements.txt

Data Preparation

To use your own dataset, convert it to LMDB format with create_dataset.py. (The script is borrowed from https://github.com/bgshih/crnn/blob/master/tool/create_dataset.py, provided by Baoguang Shi.)
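As a rough illustration of what the conversion produces, the sketch below builds the key/value layout that the referenced crnn tool is understood to write: one `image-%09d` and one `label-%09d` entry per sample, numbered from 1, plus a `num-samples` entry. The function names `build_cache` and `write_lmdb` and the `map_size` default are hypothetical and are not part of this repository; check create_dataset.py for the authoritative behavior.

```python
import os


def build_cache(samples, start_index=1):
    """Build LMDB key/value pairs for (image_bytes, label) samples.

    Assumed layout (mirrors the crnn create_dataset.py tool):
    'image-000000001' -> raw encoded image bytes
    'label-000000001' -> UTF-8 encoded text label
    'num-samples'     -> number of samples, as a decimal string
    """
    cache = {}
    count = 0
    for offset, (image_bytes, label) in enumerate(samples):
        index = start_index + offset
        cache[b"image-%09d" % index] = image_bytes
        cache[b"label-%09d" % index] = label.encode("utf-8")
        count += 1
    cache[b"num-samples"] = str(count).encode("utf-8")
    return cache


def write_lmdb(output_path, cache, map_size=1 << 30):
    """Write the cache to an LMDB environment (hypothetical helper)."""
    import lmdb  # third-party package, imported lazily

    os.makedirs(output_path, exist_ok=True)
    env = lmdb.open(output_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for key, value in cache.items():
            txn.put(key, value)
    env.close()
```

A call such as `write_lmdb("data/my_lmdb", build_cache(samples))` would then store the samples; the output path here is only an example.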

For the labeled dataset, Synth90K and SynthText have already been converted to LMDB format by luyang-NWPU: [Here], password: tw3x

For the unlabeled dataset, you can download raw images from ImageNet, Places2 and OpenImages, then detect and crop word images from them.
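The cropping step could be sketched as follows, assuming a text detector of your choice has already produced axis-aligned boxes as `(left, top, right, bottom)` pixel coordinates. The detector itself, the helper names `clamp_box` and `crop_words`, and the 5% margin are assumptions made for illustration; the README does not specify how the authors detected or cropped words.

```python
def clamp_box(box, image_size, margin=0.0):
    """Expand a (left, top, right, bottom) box by a relative margin and clip it to the image.

    Returns None when the clipped box is empty.
    """
    left, top, right, bottom = box
    width, height = image_size
    pad_x = (right - left) * margin
    pad_y = (bottom - top) * margin
    left = max(0, int(round(left - pad_x)))
    top = max(0, int(round(top - pad_y)))
    right = min(width, int(round(right + pad_x)))
    bottom = min(height, int(round(bottom + pad_y)))
    if right <= left or bottom <= top:
        return None
    return left, top, right, bottom


def crop_words(image_path, boxes, margin=0.05):
    """Crop one word image per detected box (hypothetical helper)."""
    from PIL import Image  # third-party package (Pillow), imported lazily

    crops = []
    with Image.open(image_path) as image:
        image = image.convert("RGB")
        for box in boxes:
            clamped = clamp_box(box, image.size, margin)
            if clamped is not None:
                crops.append(image.crop(clamped))
    return crops
```

The resulting crops can be saved as image files and converted to LMDB in the same way as a labeled dataset.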

Supervised Training

    sh run_baseline.sh

Semi-Supervised Training

    sh run_semi.sh

Testing

    sh run_test.sh

Pretrained Models and Results

Download the pretrained models from here (https://pan.baidu.com/s/1JF97VY0oiPiK5GpDEsYioQ?pwd=abhr, password: abhr) and put them in the saved_models directory.

| Model | Labeled data | Unlabeled data | IC13 857 | IC13 1015 | SVT | IIIT | IC15 1811 | IC15 2077 | SVTP | CUTE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TRBA_pr | 10% (Synth90K+SynthText) | - | 96.3 | 94.3 | 91.5 | 94.3 | 81.5 | 77.7 | 84.2 | 87.5 |
| TRBA_cr | 10% (Synth90K+SynthText) | 1.06M unlabeled real data | 97.2 | 95.9 | 94.3 | 96.6 | 86.8 | 79.7 | 89.0 | 93.4 |
| TRBA_pr | Synth90K + SynthText | - | 97.2 | 95.9 | 91.8 | 95.5 | 83.6 | 79.7 | 87.3 | 88.5 |
| TRBA_cr | Synth90K + SynthText | 10.6M unlabeled real data | 98.0 | 96.4 | 96.0 | 97.0 | 88.8 | 84.9 | 90.9 | 95.1 |

Acknowledgment

This code is based on STR-Fewer-Labels by Jeonghun Baek. Thanks for your contribution.

License: MIT License