
Toy Neural Machine Translation with TensorFlow

Introduction

This is an oversimplified, no-frills refactor of the tensorflow/nmt library. In roughly 500 lines of code, it illustrates the core idea of an encoder-decoder network and provides an end-to-end training and inference example. A basic attention mechanism (Luong style) is also included as a training option.
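For orientation, the sketch below shows what such an encoder-decoder with optional Luong attention looks like in the TF 1.x `tf.contrib.seq2seq` API that tensorflow/nmt builds on. It is a minimal illustration only: the function name, hyperparameters, and variable names here are assumptions, not the ones used in this repo.

```python
import tensorflow as tf  # TF 1.x

def build_seq2seq(src_ids, src_len, tgt_in_ids, tgt_len,
                  vocab_size=30, embed_dim=32, num_units=64, use_attention=True):
    # Shared embedding for the tiny number/letter vocabularies.
    embedding = tf.get_variable("embedding", [vocab_size, embed_dim])

    # Encoder: a single LSTM over the embedded source sequence.
    enc_inputs = tf.nn.embedding_lookup(embedding, src_ids)
    enc_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
    enc_outputs, enc_state = tf.nn.dynamic_rnn(
        enc_cell, enc_inputs, sequence_length=src_len, dtype=tf.float32)

    # Decoder cell, optionally wrapped with Luong-style attention over encoder outputs.
    dec_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
    if use_attention:
        attention = tf.contrib.seq2seq.LuongAttention(
            num_units, enc_outputs, memory_sequence_length=src_len)
        dec_cell = tf.contrib.seq2seq.AttentionWrapper(dec_cell, attention)
        init_state = dec_cell.zero_state(
            tf.shape(src_ids)[0], tf.float32).clone(cell_state=enc_state)
    else:
        init_state = enc_state

    # Training decoder: feed the ground-truth target inputs (teacher forcing).
    dec_inputs = tf.nn.embedding_lookup(embedding, tgt_in_ids)
    helper = tf.contrib.seq2seq.TrainingHelper(dec_inputs, tgt_len)
    decoder = tf.contrib.seq2seq.BasicDecoder(
        dec_cell, helper, init_state, output_layer=tf.layers.Dense(vocab_size))
    outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder)
    return outputs.rnn_output  # logits over the target vocabulary
```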

Training and inference are based on synthetic data. Each source sentence is a sequence of numbers in {1, ..., 26}, and the target sentence is the sequence of corresponding ASCII letters. For example, "3 2 7 26" translates to "c b g z". A source sentence is generated in two steps: 1) sample the sentence length n (e.g. uniformly between 1 and 10); 2) independently sample n numbers from {1, ..., 26}.
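As a rough illustration, the snippet below generates one such (source, target) pair; the helper name and defaults (make_example, max_len=10) are placeholders rather than what train.py actually uses.

```python
import random
import string

def make_example(max_len=10, vocab_size=26):
    """Sample one synthetic (source, target) pair."""
    n = random.randint(1, max_len)                           # 1) sample sentence length
    src = [random.randint(1, vocab_size) for _ in range(n)]  # 2) sample n numbers
    tgt = [string.ascii_lowercase[i - 1] for i in src]       # map 3 -> 'c', 26 -> 'z', ...
    return " ".join(str(i) for i in src), " ".join(tgt)

print(make_example())  # e.g. ('3 2 7 26', 'c b g z')
```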

Run the code

Running the code is simple:

python train.py
tensorboard --logdir /tmp/tf/log/nmt/test/

With the default parameters, you should see some debug info after each epoch, containing: 1) a few sampled (src, tgt, nmt) tuples; 2) the translation of each single number to its corresponding letter.

Done with epoch 4. Global step = 400
src:  20
tgt:  t
nmt:  
src:  16 20 1 10 14 18
tgt:  p t a j n r
nmt:  n t n n n r
char prediction: a, , c, d, e, , g, g, l, c, , , , , , , , , , , u, d, , , ,  

...

Done with epoch 30. Global step = 3000
src:  25 9
tgt:  y i
nmt:  y i
src:  15 10 2 21 4 16 14 15 21
tgt:  o j b u d p n o u
nmt:  o j j r i r r o r
char prediction:  a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z

...

Done with epoch 100. Global step = 10000
src:  8 22 17 11 11 21 8 21 5 8
tgt:  h v q k k u h u e h
nmt:  h v q k k q w u g a
src:  3 2 4 13 6 22 22
tgt:  c b d m f v v
nmt:  c b d m f v v
char prediction:  a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z

...

Done with epoch 180. Global step = 18000
src:  21 2 18 19 3 2 13 15 2
tgt:  u b r s c b m o b
nmt:  u b r s c b m o b
src:  10 19 18 12 1 10 17 17 25
tgt:  j s r l a j q q y
nmt:  j s r l a j q q y
char prediction:  a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z

The NMT model gradually learned a few things:

  • Correct mapping from a single number in {1..26} to the corresponding letter in {a..z}.
  • Correct target sequence length.
  • Good translation of short sentences.
  • Good translation of longer sentences.

It would be interesting to look at the trained model parameters and analyze what happened under the hood.
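As a starting point, one could list what is stored in the saved checkpoint. The path below is only a guess based on the TensorBoard log directory used above, and the variable names will depend on how the model defines them.

```python
import tensorflow as tf  # TF 1.x

# Hypothetical checkpoint location; adjust to wherever train.py writes its checkpoints.
ckpt = tf.train.latest_checkpoint("/tmp/tf/log/nmt/test/")

# Print the name and shape of every saved variable.
for name, shape in tf.train.list_variables(ckpt):
    print(name, shape)

# Read actual values, e.g. an embedding matrix, once you know its saved name.
reader = tf.train.load_checkpoint(ckpt)
# embedding = reader.get_tensor("embedding")  # name depends on the model definition
```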

Things not covered

A few important components that are not covered here (yet):

  • Real language dataset
  • Beam search
  • Smarter learning rate scheduling
  • Perplexity and BLEU metrics
