arshjot / ScrabbleGAN

Handwritten Text Generation

ScrabbleGAN - Handwritten Text Generation

A PyTorch implementation of the ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation paper. Parts of the code have been adapted from the official implementation of the paper. The purpose of this repository is to provide a clear and simple way to understand and replicate the results of the paper.

Image generated by our trained model

Requirements

  • PyTorch v1.6.0 - for all the deep learning components
  • PyTorch-FID - for FID score calculation
  • OpenCV 3 - for image processing (not required for generating new images)

A complete requirements.txt file will be added soon.

Steps for training the ScrabbleGAN model from scratch

  1. Download the IAM dataset or the RIMES database and place it in the /data/ directory as shown below:

    ├── data
    |   ├── IAM
    |       └──ascii
    |           └──words.txt
    |       └──words
    |           └──a01
    |           └──a02
    |           .
    |           .
    |       └──original_partition
    |           └──te.lst, tr.lst, va1.lst, va2.lst
    |   ├── RIMES
    |       └──ground_truth_training_icdar2011.txt
    |       └──training
    |           └──lot_1
    |           └──lot_2
    |           .
    |           .
    |       └──ground_truth_validation_icdar2011.txt
    |       └──validation
    |           └──lot_14
    |           └──lot_15
    |           └──lot_16
    |       .
    |       .
    |   └── prepare_data.py 
  2. Modify the /config.py file to change the dataset, model architecture, image height, etc. The default parameters are the ones used in the paper.

  3. From the data directory, run:

    python prepare_data.py

    This will process the ground-truth labels and images, and create a pickle file to be used for training.

  4. Start model training by running the command below from the main directory:

    python train.py

    A sample generated image will be saved in the output directory after every epoch. TensorBoard logging is also enabled.
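
Before running prepare_data.py, it can help to confirm that the downloaded files match the directory tree in step 1. The following is a minimal sketch, not part of the repository: the `EXPECTED` table and the `missing_paths` helper are hypothetical names, and only the top-level entries shown in the tree are checked (not every sub-folder such as a01 or lot_1).

```python
from pathlib import Path

# Relative paths copied from the directory tree in step 1.
# EXPECTED and missing_paths are illustrative names, not repository code.
EXPECTED = {
    "IAM": [
        "ascii/words.txt",
        "words",
        "original_partition/te.lst",
        "original_partition/tr.lst",
        "original_partition/va1.lst",
        "original_partition/va2.lst",
    ],
    "RIMES": [
        "ground_truth_training_icdar2011.txt",
        "training",
        "ground_truth_validation_icdar2011.txt",
        "validation",
    ],
}


def missing_paths(data_root, dataset):
    """Return the expected entries that are absent under data_root/dataset."""
    base = Path(data_root) / dataset
    return [rel for rel in EXPECTED[dataset] if not (base / rel).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the main directory, next to data/.
    for name in EXPECTED:
        missing = missing_paths("data", name)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{name}: {status}")
```

Only the dataset selected in config.py needs to be complete; a "missing" report for the other one can be ignored.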

Steps for generating new images

The easiest way to generate images is to use this demo; it has options for generating random text, specific text, random styles, consistent style, etc. Another option is to download these files:

  1. Pretrained models for English (IAM) or French (RIMES).
  2. Character mapping for English (IAM) or French (RIMES).
  3. Lexicon files for English or French.

After downloading the required files, follow the below steps:

  1. Change the dataset and lexicon_file path in config.py.
  2. Run:
    python generate_images.py -c 'path_to_checkpoint_file' -m 'path_to_character_mapping_file'
    This will generate random images. You can also check the arguments in generate_images.py to see more options.

Steps to check FID score

Create the preprocessed data file as described in steps 1-3 of "Steps for training the ScrabbleGAN model from scratch". Also, either download the model checkpoints for English (IAM) or French (RIMES), or train your own model and save the checkpoints. To check the FID score, run:

    python calculate_metrics.py -c 'path_to_checkpoint_file'
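
For context, the Fréchet Inception Distance is commonly defined as the Fréchet distance between two Gaussians fitted to Inception features of real and generated images. The symbols below are notation chosen for this note, not taken from the repository: \mu_r, \Sigma_r are the feature mean and covariance for real images, and \mu_g, \Sigma_g those for generated images. Lower values indicate that the two feature distributions are closer.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```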

Steps for training HTR models

One of the motivations in the paper was to boost HTR performance using synthetic data generated by ScrabbleGAN. The code for HTR training is not provided in this repository, for consistency with the authors' approach of using this code for HTR training. You can follow the steps below for HTR training:

  1. Create your own models or download all the files listed in "Steps for generating new images". Also, create the preprocessed data file as described in steps 1-3 of "Steps for training the ScrabbleGAN model from scratch".
  2. If required, change dataset, partition, data_file, and lexicon_file in config.py.
  3. To create LMDB data files required for HTR training, run:
    python create_lmdb_dataset.py -c 'path_to_checkpoint_file' -m 'path_to_character_mapping_file'
    to create an LMDB dataset without any synthetic images, or
    python create_lmdb_dataset.py -c 'path_to_checkpoint_file' -m 'path_to_character_mapping_file' -n 100000
    to add generated images to the original dataset.
  4. Train the HTR model as described here
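
The two commands in step 3 differ only in the -n flag, so they can be assembled from one small helper when scripting several runs. This is a sketch under assumptions: `lmdb_command` is a hypothetical helper, not part of the repository, and it uses only the -c, -m, and -n flags shown above.

```python
import sys


def lmdb_command(checkpoint, char_map, n_synthetic=None):
    """Build the argument list for create_lmdb_dataset.py.

    n_synthetic=None gives the first command (original data only);
    an integer appends `-n <value>` as in the second command.
    """
    cmd = [
        sys.executable, "create_lmdb_dataset.py",
        "-c", str(checkpoint),
        "-m", str(char_map),
    ]
    if n_synthetic is not None:
        cmd += ["-n", str(n_synthetic)]
    return cmd
```

The returned list can be passed to `subprocess.run(..., check=True)` from the main directory, for example `subprocess.run(lmdb_command('path_to_checkpoint_file', 'path_to_character_mapping_file', 100000), check=True)`.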

License: MIT License