
simplest_seq2seq

The simplest seq2seq implementation, based on an encoder-decoder architecture, using TensorFlow 1.3

Software Environment:

  • Ubuntu 16.04 x64
  • Python 3.5
  • TensorFlow 1.3

What is it:

It is the simplest implementation of a sequence-to-sequence model. The fake datasets are generated by generateData.py and use a vocabulary from a to z. The core idea is to predict a variable-length target sequence given a variable-length source sequence. The architecture is shown below:

Figure 1. Architecture of the model
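Stripped of the I/O and training-loop details, the encoder-decoder core boils down to roughly the sketch below (TF 1.x graph code; the sizes and variable names are illustrative assumptions, not the exact ones in encoderDecoder.py):

```python
import tensorflow as tf

# Illustrative sizes, not the exact values used in encoderDecoder.py
vocab_size, embed_size, hidden_size = 30, 16, 64   # a-z plus a few special tokens

source = tf.placeholder(tf.int32, [None, None])      # [batch, src_len]
source_len = tf.placeholder(tf.int32, [None])
target_in = tf.placeholder(tf.int32, [None, None])   # decoder input, [batch, tgt_len]

embedding = tf.get_variable('embedding', [vocab_size, embed_size])

# Encoder: compress the variable-length source into a fixed-size state
with tf.variable_scope('encoder'):
    enc_cell = tf.contrib.rnn.GRUCell(hidden_size)
    _, enc_state = tf.nn.dynamic_rnn(
        enc_cell, tf.nn.embedding_lookup(embedding, source),
        sequence_length=source_len, dtype=tf.float32)

# Decoder: unroll conditioned on the encoder state, score one token per step
with tf.variable_scope('decoder'):
    dec_cell = tf.contrib.rnn.GRUCell(hidden_size)
    dec_outputs, _ = tf.nn.dynamic_rnn(
        dec_cell, tf.nn.embedding_lookup(embedding, target_in),
        initial_state=enc_state, dtype=tf.float32)
    logits = tf.layers.dense(dec_outputs, vocab_size)  # per-step scores over the vocabulary
```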

How to play:

  1. Run generateData.py to create random fake datasets: vocab.dat, input.dat, output.dat and pred_logs/groundtruth.dat. (Note: groundtruth.dat is the last 10% of output.dat by default; a rough sketch of what this script does appears after this list.)
  2. Run encoderDecoder.py to train the model and write the predicted sequences into the folder pred_logs.
  3. Go into the folder pred_logs and run pytasas.py, which creates the CER log file test_cer_tasas.log; then run drawCER.py to visualize the CER results.
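For orientation, this sketch mimics what a generator like generateData.py could do. The sequence lengths, sample count, file layout, and the input-to-output mapping are all assumptions here, so treat generateData.py itself as the reference:

```python
import random
import string

# Hypothetical stand-in for generateData.py: paired random a-z sequences.
# Lengths, sample count, file layout and the input->output rule are assumptions.
def make_fake_dataset(n_samples=1000, min_len=3, max_len=10, seed=0):
    random.seed(seed)
    with open('input.dat', 'w') as f_in, open('output.dat', 'w') as f_out:
        for _ in range(n_samples):
            seq = [random.choice(string.ascii_lowercase)
                   for _ in range(random.randint(min_len, max_len))]
            f_in.write(''.join(seq) + '\n')
            f_out.write(''.join(reversed(seq)) + '\n')  # e.g. target = reversed source
    with open('vocab.dat', 'w') as f_vocab:
        f_vocab.write('\n'.join(string.ascii_lowercase) + '\n')

if __name__ == '__main__':
    make_fake_dataset()
```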

How to calculate CER:

In my work, I use a third-party command line tool, https://github.com/mauvilsa/htrsh, to calculate the character error rate (CER). You can also use TensorFlow's built-in edit_distance function to do it.
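A minimal sketch of the edit_distance route, assuming the predictions and ground truth are dense, 0-padded id matrices (the padding id and shapes are assumptions):

```python
import tensorflow as tf

def dense_to_sparse(dense, pad_id=0):
    """Strip padding and convert a dense id matrix to the SparseTensor
    format that tf.edit_distance expects."""
    idx = tf.where(tf.not_equal(dense, pad_id))
    return tf.SparseTensor(idx, tf.gather_nd(dense, idx),
                           tf.shape(dense, out_type=tf.int64))

predictions = tf.placeholder(tf.int32, [None, None])   # [batch, max_len], 0-padded
groundtruth = tf.placeholder(tf.int32, [None, None])

# normalize=True divides each distance by the ground-truth length, i.e. the CER
cer = tf.reduce_mean(tf.edit_distance(dense_to_sparse(predictions),
                                      dense_to_sparse(groundtruth),
                                      normalize=True))
```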

Result:

Figure 2. CER of the testing datasets

There is something important to note here: when training the seq2seq model, the decoder input should be the ground truth (teacher forcing), but when testing the model, the decoder input must be generated by the model itself, one step at a time. This is the simplest scheme; in future work I will give Scheduled Sampling a try.
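In TF 1.3's tf.contrib.seq2seq, this train/test asymmetry maps directly onto two helpers. A minimal sketch, assuming illustrative sizes and GO/EOS token ids (none of these names come from the repo's code):

```python
import tensorflow as tf
from tensorflow.python.layers.core import Dense

# Illustrative sizes and special token ids (assumptions, not the repo's values)
vocab_size, embed_size, hidden_size = 30, 16, 64
GO_ID, EOS_ID = 1, 2

embedding = tf.get_variable('embedding', [vocab_size, embed_size])
dec_cell = tf.contrib.rnn.GRUCell(hidden_size)
enc_state = tf.placeholder(tf.float32, [None, hidden_size])   # encoder final state
dec_inputs = tf.placeholder(tf.int32, [None, None])           # ground truth, GO-prefixed
dec_lengths = tf.placeholder(tf.int32, [None])
batch_size = tf.shape(dec_inputs)[0]

# Training: the decoder reads the ground-truth sequence (teacher forcing)
train_helper = tf.contrib.seq2seq.TrainingHelper(
    tf.nn.embedding_lookup(embedding, dec_inputs), dec_lengths)

# Testing: the decoder feeds back its own argmax prediction at every step
infer_helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
    embedding, start_tokens=tf.fill([batch_size], GO_ID), end_token=EOS_ID)

def decode(helper, reuse=None):
    with tf.variable_scope('decoder', reuse=reuse):
        decoder = tf.contrib.seq2seq.BasicDecoder(
            dec_cell, helper, enc_state, output_layer=Dense(vocab_size))
        return tf.contrib.seq2seq.dynamic_decode(decoder, maximum_iterations=50)[0]

train_outputs = decode(train_helper)              # .rnn_output feeds the training loss
infer_outputs = decode(infer_helper, reuse=True)  # .sample_id gives predicted token ids
```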

Update note:

encoderDecoder.py uses the legacy TensorFlow API. If you want to try the newer TF features, run encoderDecoder_newAPI.py instead; note that it only prints its progress to the screen rather than writing log files.

The architecture of the model with Bahdanau attention is:

Figure 3. Model with Bahdanau attention

But in encoderDecoder_newAPI.py, I replace the BLSTM with a plain GRU in the encoder to keep it as simple as possible.
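For reference, the attention hookup in the new API looks roughly like this. It is a sketch assuming a GRU encoder whose outputs and source lengths are available, not the exact code from encoderDecoder_newAPI.py:

```python
import tensorflow as tf

hidden_size = 64
# Encoder outputs over which the decoder will attend (from the plain GRU encoder)
enc_outputs = tf.placeholder(tf.float32, [None, None, hidden_size])  # [batch, src_len, hidden]
source_len = tf.placeholder(tf.int32, [None])

# Bahdanau (additive) attention over the encoder outputs
attention = tf.contrib.seq2seq.BahdanauAttention(
    num_units=hidden_size, memory=enc_outputs,
    memory_sequence_length=source_len)

# Wrap the decoder GRU so that every decoding step attends over the source
dec_cell = tf.contrib.seq2seq.AttentionWrapper(
    tf.contrib.rnn.GRUCell(hidden_size), attention,
    attention_layer_size=hidden_size)

# The wrapped cell carries extra attention state; build its initial state here
# (it can also be seeded from the encoder state via .clone(cell_state=...))
init_state = dec_cell.zero_state(tf.shape(enc_outputs)[0], tf.float32)
```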

TODO:

  • Switch from the legacy API to higher-level wrappers
  • Add attention mechanism
