anhtu-phan / ocr-significance-testing

Scene Text Telescope

This is the code for the CVPR 2021 paper "Scene Text Telescope: Text-Focused Scene Image Super-Resolution". [link]

[Figure: architecture]

Dependencies

Set up an environment with Python 3.6 and install the required libraries with pip:

pip install -r requirement.txt

Pre-trained Model

Here are some outputs obtained with the TBSRN backbone and the text-focused modules enabled:

[Model] [Log]

Dataset

Download all resources from BaiduYunDisk (password: stt6) or Dropbox:

  • TextZoom dataset
  • Pretrained weights of CRNN
  • Pretrained weights of Transformer-based recognizer

All the resources should be placed under ./dataset/mydata, for example:

./dataset/mydata/train1
./dataset/mydata/train2
./dataset/mydata/pretrain_transformer.pth
...
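
Before training, it can help to confirm that the downloaded resources landed where the code expects them. The following is a minimal sketch, not part of the repository: the helper name missing_entries is hypothetical, and it checks only the three entries named in the listing above. The listing ends with "...", so the real directory contains more items than this script knows about.

```python
from pathlib import Path

# Entries named in the example listing above. The listing is truncated,
# so this tuple is an illustrative subset, not the full expected layout.
EXPECTED_ENTRIES = ("train1", "train2", "pretrain_transformer.pth")


def missing_entries(root="./dataset/mydata", expected=EXPECTED_ENTRIES):
    """Return the names from `expected` that do not exist under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_entries()
    if missing:
        print("Missing under ./dataset/mydata:", ", ".join(missing))
    else:
        print("All listed resources were found.")
```

Run it from the repository root so that the relative path ./dataset/mydata resolves correctly.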

Training

Remember to change the experiment name (EXP_NAME) and replace GPU_NUM with the index of the GPU to use. The two text-focused modules are activated whenever --text_focus is passed:

CUDA_VISIBLE_DEVICES=GPU_NUM python main.py --batch_size=16 --STN --exp_name EXP_NAME --text_focus

Testing

CUDA_VISIBLE_DEVICES=GPU_NUM python main.py --batch_size=16 --STN --exp_name EXP_NAME --text_focus --resume YOUR_MODEL --test --test_data_dir ./dataset/mydata/test

Demo

CUDA_VISIBLE_DEVICES=GPU_NUM python main.py --batch_size=16 --STN --exp_name EXP_NAME --text_focus --demo --demo_dir ./demo
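
The training, testing, and demo commands above differ only in their trailing flags, so they can be assembled from one place. The sketch below is an illustrative wrapper, not part of the repository: build_command and run are hypothetical helper names, and the flags are copied verbatim from the three commands above without adding any others.

```python
import os
import subprocess


def build_command(mode, exp_name, resume=None, batch_size=16):
    """Build the main.py argument list for mode 'train', 'test', or 'demo'."""
    # Flags shared by all three commands shown above.
    cmd = ["python", "main.py", "--batch_size={}".format(batch_size),
           "--STN", "--exp_name", exp_name, "--text_focus"]
    if mode == "test":
        if resume is None:
            raise ValueError("test mode needs a checkpoint path for --resume")
        cmd += ["--resume", resume, "--test",
                "--test_data_dir", "./dataset/mydata/test"]
    elif mode == "demo":
        cmd += ["--demo", "--demo_dir", "./demo"]
    elif mode != "train":
        raise ValueError("unknown mode: {}".format(mode))
    return cmd


def run(mode, exp_name, gpu="0", resume=None):
    """Run main.py with CUDA_VISIBLE_DEVICES set to the chosen GPU index."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
    return subprocess.run(build_command(mode, exp_name, resume),
                          env=env, check=True)


if __name__ == "__main__":
    # Print the training command instead of launching it.
    print(" ".join(build_command("train", "EXP_NAME")))
```

Calling run("test", "EXP_NAME", gpu="0", resume="YOUR_MODEL") would launch the testing command; YOUR_MODEL stands for the checkpoint path, as in the command above.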

Acknowledgement

We inherited most of the framework from TextZoom and use the pretrained CRNN model from CRNN. Thanks for your contributions!

@JasonBoy1

@meijieru

Citation

@inproceedings{chen2021scene,
  title={Scene Text Telescope: Text-Focused Scene Image Super-Resolution},
  author={Chen, Jingye and Li, Bin and Xue, Xiangyang},
  booktitle={CVPR},
  pages={12026--12035},
  year={2021}
}
