
Adversarial Sequence-to-sequence Domain Adaptation (ASSDA)

Overview

We propose a novel Adversarial Sequence-to-sequence Domain Adaptation network (ASSDA) for robust text image recognition, which adaptively transfers both coarse global-level and fine-grained character-level knowledge across domains.
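
As a rough illustration (not the repository's actual code; all class and variable names below are hypothetical), domain-adversarial training of this kind is commonly implemented with a gradient reversal layer: a small domain classifier learns to separate source from target features, while the reversed gradient trains the encoder to make them indistinguishable:

    import torch
    import torch.nn as nn

    class GradReverse(torch.autograd.Function):
        # Identity in the forward pass; flips and scales the gradient in
        # the backward pass, so the encoder is trained to fool the
        # domain classifier.
        @staticmethod
        def forward(ctx, x, lambd):
            ctx.lambd = lambd
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            return -ctx.lambd * grad_output, None

    class DomainClassifier(nn.Module):
        # Binary source-vs-target classifier over a feature vector, e.g.
        # a global image feature (coarse alignment) or a per-character
        # attention feature (fine-grained alignment). Sizes are assumptions.
        def __init__(self, feat_dim, hidden_dim=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(feat_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, 2))

        def forward(self, feat, lambd=1.0):
            return self.net(GradReverse.apply(feat, lambd))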

Install

  1. This code was tested with cuda==10.1 and python==3.6.8.

  2. Install the requirements:

pip3 install torch==1.2.0 pillow==6.2.1 torchvision==0.4.0 lmdb nltk natsort
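
Optionally, verify that the pinned versions installed and that PyTorch can see the GPU (a sanity check, not part of the repository):

    python3 -c "import torch, torchvision; print(torch.__version__, torchvision.__version__, torch.cuda.is_available())"
    # expected: 1.2.0 0.4.0 True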

Dataset

  • The prepared synthetic and real scene datasets, created by NAVER Corp., can be downloaded from here; they are stored as lmdb databases (see the snippet after this list).

  • The prepared handwritten text dataset can be downloaded from here.

    • Handwritten text: IAM
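
A minimal sketch for inspecting one of these lmdb datasets, assuming the format used by deep-text-recognition-benchmark (a num-samples count plus image-%09d/label-%09d entries, indexed from 1) and assuming the IAM test set sits at ./data/IAM/test:

    import lmdb

    env = lmdb.open('./data/IAM/test', readonly=True, lock=False)
    with env.begin() as txn:
        num_samples = int(txn.get('num-samples'.encode()))
        print('samples:', num_samples)
        # Labels are UTF-8 strings; images are stored as encoded bytes.
        print('first label:', txn.get('label-000000001'.encode()).decode())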

Training and evaluation

  • For a toy example, you can download the pretrained model from here.

    • Place the downloaded model files in data/
  • Training model

    CUDA_VISIBLE_DEVICES=1 python train_da_global_local_selected.py --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn \
    --src_train_data ./data/data_lmdb_release/training/ \
    --tar_train_data ./data/IAM/test --tar_select_data IAM --tar_batch_ratio 1 --valid_data ./data/IAM/test/ \
    --continue_model ./data/TPS-ResNet-BiLSTM-Attn.pth \
    --batch_size 128 --lr 1 \
    --experiment_name _adv_global_local_synth2iam_pc_0.1 --pc 0.1
    
  • Test model

    • Test the baseline model

      CUDA_VISIBLE_DEVICES=0 python test.py --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn \
      --eval_data ./data/IAM/test \
      --saved_model ./data/TPS-ResNet-BiLSTM-Attn.pth
      
    • Test the adaptation model

      CUDA_VISIBLE_DEVICES=0 python test.py --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn \
      --eval_data ./data/IAM/test \
      --saved_model saved_models/TPS-ResNet-BiLSTM-Attn-Seed1111_adv_global_local_selected/best_accuracy.pth
      

Citation

If you use this code in a paper, please cite:

@inproceedings{zhang2019sequence,
  title={Sequence-to-sequence domain adaptation network for robust text image recognition},
  author={Zhang, Yaping and Nie, Shuai and Liu, Wenju and Xu, Xing and Zhang, Dongxiang and Shen, Heng Tao},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={2740--2749},
  year={2019}
}

@article{zhang2021robust,
  title={Robust Text Image Recognition via Adversarial Sequence-to-Sequence Domain Adaptation},
  author={Zhang, Yaping and Nie, Shuai and Liang, Shan and Liu, Wenju},
  journal={IEEE Transactions on Image Processing},
  volume={30},
  pages={3922--3933},
  year={2021},
  publisher={IEEE}
}

Acknowledgement

This implementation is based on the repository deep-text-recognition-benchmark.
