tonyjo / multi_digit_classification_attention

Using attention for sequence classification for multi-character prediction

Multi-character Prediction Using Attention

Setup

pip install -r requirements.txt

The codebase was developed on Python 2.7.15.

Multi-Digit classification - (SVHN dataset)

Preparing data

First, prepare the raw SVHN dataset.
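
The download and extraction described in the steps below could also be scripted. The following is a minimal sketch using only the Python standard library. The base URL, the archive names, and the helper names `extract_archive` and `fetch_svhn` are assumptions for illustration and are not part of this repository.

```python
import os
import tarfile

try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2.7

# Assumption: download location of the full-number SVHN archives.
SVHN_BASE_URL = "http://ufldl.stanford.edu/housenumbers/"
ARCHIVES = ("train.tar.gz", "test.tar.gz")


def extract_archive(archive_path, dest_dir):
    """Unpack one .tar.gz archive into dest_dir, creating it if needed."""
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)


def fetch_svhn(dest_dir):
    """Download and unpack the train and test archives (needs network access)."""
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    for name in ARCHIVES:
        archive_path = os.path.join(dest_dir, name)
        # Skip the download when the archive is already present.
        if not os.path.exists(archive_path):
            urlretrieve(SVHN_BASE_URL + name, archive_path)
        extract_archive(archive_path, dest_dir)
```

Calling `fetch_svhn(os.path.join("SVHN", "dataset"))` from the repository root would place the extracted data in the dataset folder, assuming that is the layout the scripts expect.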

  1. Go to the cloned multi_digit_classification_attention folder and run the following commands:
cd SVHN
mkdir dataset
  2. Download the SVHN dataset and extract the train and test data into the dataset folder created in the previous step.

  3. Select which type of data to curate (train or test) and run the following command:

python gen_crop_dataset.py --dataset_type=<train/test>
  4. Select which type of data to generate attention masks for (train or test) and run the following command:
python gen_attn_truth.py --dataset_type=<train/test>
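
This README does not describe how gen_attn_truth.py builds its masks. For intuition only, one common way to derive an attention target from SVHN annotations is to mark the pixels covered by each digit's bounding box. The sketch below assumes boxes given as (left, top, width, height) tuples in pixels. The function name `make_attention_mask` is hypothetical, and the actual script may work differently.

```python
def make_attention_mask(height, width, boxes):
    """Return a height x width grid with 1.0 inside any box and 0.0 elsewhere.

    boxes is a list of (left, top, box_width, box_height) tuples in pixels.
    This is an illustrative assumption, not the repository's implementation.
    """
    mask = [[0.0] * width for _ in range(height)]
    for left, top, box_w, box_h in boxes:
        # Clip each box to the image so out-of-range annotations do not fail.
        x0, y0 = max(0, left), max(0, top)
        x1, y1 = min(width, left + box_w), min(height, top + box_h)
        for y in range(y0, y1):
            for x in range(x0, x1):
                mask[y][x] = 1.0
    return mask
```

For example, `make_attention_mask(4, 6, [(1, 1, 2, 2)])` marks a 2 x 2 block of pixels starting at row 1, column 1.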

Training

  1. To train the detection model, run the following command:
python train.py
  2. To train the classification model, run the following command:
python train_classify_net.py

Inference

To test and visualize results run the following command:

jupyter notebook

and open and run:

> evaluate_and_viz.ipynb
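
If an interactive session is not available, the notebook can also be executed from a script with `jupyter nbconvert --execute`. The sketch below wraps that command. The helper names `nbconvert_command` and `run_notebook` are hypothetical, and it assumes Jupyter is installed and on the PATH.

```python
import subprocess


def nbconvert_command(notebook, output=None):
    """Build a `jupyter nbconvert` command that executes a notebook."""
    cmd = ["jupyter", "nbconvert", "--to", "notebook", "--execute", notebook]
    if output is not None:
        # Optional name for the executed copy of the notebook.
        cmd += ["--output", output]
    return cmd


def run_notebook(notebook, output=None):
    """Execute the notebook; raises CalledProcessError if nbconvert fails."""
    subprocess.check_call(nbconvert_command(notebook, output))
```

For instance, `run_notebook("evaluate_and_viz.ipynb", output="evaluate_and_viz_run")` would write an executed copy next to the original, leaving the source notebook unchanged.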

End-2-End Learning for detection and classification

  1. To train the full model for both detection and classification, run the following command:
python train_end2end.py
  2. To test and visualize results, run the following command:
jupyter notebook

and open and run:

> evaluate_and_viz_end2end.ipynb
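
This README does not state which objective train_end2end.py optimizes. A common way to train detection and classification jointly is to add a detection loss and a classification loss into one objective. The sketch below illustrates that idea in plain Python, assuming binary cross-entropy on the attention mask and cross-entropy per character. All function names and the `cls_weight` parameter are hypothetical and may not match the repository's code.

```python
import math


def binary_cross_entropy(pred_mask, true_mask, eps=1e-7):
    """Mean binary cross-entropy between two flat lists of values in [0, 1]."""
    total = 0.0
    for p, t in zip(pred_mask, true_mask):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred_mask)


def cross_entropy(probs, label, eps=1e-7):
    """Cross-entropy for one character: -log of the true class probability."""
    return -math.log(max(probs[label], eps))


def joint_loss(pred_mask, true_mask, char_probs, labels, cls_weight=1.0):
    """Detection loss plus the weighted mean per-character classification loss."""
    det = binary_cross_entropy(pred_mask, true_mask)
    cls = sum(cross_entropy(p, y) for p, y in zip(char_probs, labels)) / len(labels)
    return det + cls_weight * cls
```

With a single combined loss, gradients from the classification term can also shape the attention predictions, which is the usual motivation for end-to-end training over training the two stages separately.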

CAPTCHA classification - (CAPTCHA dataset)

Preparing data

First, generate the raw CAPTCHA dataset.

  1. Go to the cloned multi_digit_classification_attention folder and run the following commands:
cd other/
mkdir dataset
  2. Generate the dataset by running the following command:
python gen_captcha_dataset.py
  3. Move the generated dataset into the CAPTCHA folder.
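
This README does not show what gen_captcha_dataset.py does internally. As a rough illustration of how a synthetic CAPTCHA set can be produced, the sketch below draws random label strings and, optionally, renders them with the third-party `captcha` package. The character set, label length, image size, and function names are all assumptions, and the repository's script may use a different approach.

```python
import random
import string

# Assumed character set: digits plus uppercase letters.
ALPHABET = string.digits + string.ascii_uppercase


def random_labels(count, length=4, seed=0):
    """Return `count` random label strings, reproducible for a given seed."""
    rng = random.Random(seed)
    return ["".join(rng.choice(ALPHABET) for _ in range(length))
            for _ in range(count)]


def render_captcha(label, out_path):
    """Render one label to an image file.

    Requires the third-party `captcha` package (pip install captcha);
    it is imported here so the label helper works without it.
    """
    from captcha.image import ImageCaptcha
    ImageCaptcha(width=160, height=60).write(label, out_path)
```

Keeping the label in the file name (for example `render_captcha(label, label + ".png")`) is one simple way to store the ground truth alongside each image.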

Training

  1. To train the detection model, run the following command:
python train.py
  2. To train the classification model, run the following command:
python train_classify_net.py

Inference

To test and visualize results run the following command:

jupyter notebook

and open and run:

> evaluate_and_viz.ipynb

End-2-End Learning for detection and classification

  1. To train the full model for both detection and classification, run the following command:
python train_end2end.py
  2. To test and visualize results, run the following command:
jupyter notebook

and open and run:

> evaluate_and_viz_end2end.ipynb

Contribution guidelines

Improvements are welcome. Send a merge request if you would like to contribute.

About

License: MIT License

