Dynamic Memory Networks in Tensorflow

Implementation of Dynamic Memory Networks for Visual and Textual Question Answering on the bAbI question answering tasks using Tensorflow.

Prerequisites

Python 3.x
Tensorflow 0.8+
Numpy
tqdm - Progress bar module

Usage

First, install the dependencies and clone the repository:

sudo pip install tqdm
git clone https://github.com/therne/dmn-tensorflow && cd dmn-tensorflow

Then download the dataset:

mkdir data
curl -O http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz
tar -xzf tasks_1-20_v1-2.tar.gz -C data/
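To sanity-check the download, it can help to look at how a bAbI task file is laid out. Each line starts with a number that restarts at 1 for every new story, and question lines carry the question, the answer and the supporting fact numbers separated by tabs. The sketch below is not the loader used in this repository; `parse_babi` is a hypothetical helper, and the assumption that the English task files end up under data/tasks_1-20_v1-2/en/ after extraction should be checked against your own directory.

```python
def parse_babi(lines):
    """Turn raw bAbI lines into (facts, question, answer) triples.

    Hypothetical helper for illustration; not part of this repository.
    """
    examples = []
    facts = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        idx, text = line.split(" ", 1)
        if int(idx) == 1:
            facts = []  # numbering restarts at 1 when a new story begins
        if "\t" in text:
            # Question line: question <TAB> answer <TAB> supporting fact ids
            parts = text.split("\t")
            question, answer = parts[0].strip(), parts[1]
            examples.append((list(facts), question, answer))
        else:
            facts.append(text)
    return examples


# Small in-memory sample in the same layout as the task files
sample = [
    "1 Mary moved to the bathroom.",
    "2 John went to the hallway.",
    "3 Where is Mary? \tbathroom\t1",
    "4 Daniel went back to the hallway.",
    "5 Where is Daniel? \thallway\t4",
]
examples = parse_babi(sample)
for facts, question, answer in examples:
    print(len(facts), question, answer)
```

To try it on a real file, pass an open file object instead of `sample`, since iterating over a file yields its lines.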

If you want to run the original DMN (models/old/dmn.py), you also need to download the GloVe word embedding data:

curl -O http://nlp.stanford.edu/data/glove.6B.zip
unzip glove.6B.zip -d data/glove/
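The GloVe files are plain text: each line holds a word followed by its vector components, separated by spaces. As a rough sketch of how such a file can be read (again, not the code in models/old/dmn.py), `parse_glove` below is a hypothetical helper, and the three-dimensional sample values are toy numbers made up for illustration. The real glove.6B files use 50 to 300 dimensions.

```python
def parse_glove(lines):
    """Map each word to its embedding vector (a list of floats).

    Hypothetical helper for illustration; not part of this repository.
    """
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, values = parts[0], parts[1:]
        vectors[word] = [float(v) for v in values]
    return vectors


# Toy sample in the GloVe text layout (values are illustrative only)
sample = [
    "the 0.418 0.24968 -0.41242",
    "cat 0.45281 -0.50108 -0.53714",
]
vectors = parse_glove(sample)
print(sorted(vectors), len(vectors["the"]))
```

For a real file, pass an open file object (for example one of the .txt files unzipped into data/glove/) instead of `sample`.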

Training the model

./main.py --task [bAbI Task Number]

Testing the model

./main.py --test --task [bAbI Task Number]

Results

Single run of the DMN+ model trained with the paper's settings (batch size 128, 3 episodes, 80 hidden units, dropout rate 0.9, L2 regularization) plus batch normalization. Tasks omitted from the table achieved 0% error.

Task | Error Rate
Two supporting facts | 27.2%
Three supporting facts | -
Two arguments relations | 23.4%
Three arguments relations | 1.1%
List/Sets | 0.4%
Compound coreference | 1.5%
Time reasoning | 0.8%
Basic induction | 65.9%
Positional reasoning | 19.2%
Size reasoning | 8.7%
Path finding | 69.9%
Average | 10.9%

Overfitting occurs on some tasks, and the error rates are higher than those reported in the paper. Additional regularization is probably needed.
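The reported average of 10.9% can be reproduced from the listed error rates. The short check below assumes the average is taken over all 20 bAbI tasks, with the omitted tasks and the unreported "Three supporting facts" entry counted as 0. That reading is an assumption, but it is the one that matches the reported figure (dividing by 19 instead would give roughly 11.5%).

```python
# Error rates (%) as listed in the results; tasks not listed count as 0.
reported = {
    "Two supporting facts": 27.2,
    "Two arguments relations": 23.4,
    "Three arguments relations": 1.1,
    "List/Sets": 0.4,
    "Compound coreference": 1.5,
    "Time reasoning": 0.8,
    "Basic induction": 65.9,
    "Positional reasoning": 19.2,
    "Size reasoning": 8.7,
    "Path finding": 69.9,
}

TOTAL_TASKS = 20  # the dataset archive covers tasks 1-20

# Assumption: unlisted and unreported tasks contribute 0 to the sum.
average = sum(reported.values()) / TOTAL_TASKS
print(f"Average error rate: {average:.1f}%")
```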

References

Implementing Dynamic memory networks by YerevaNN - Great article that helped me a lot
Dynamic-memory-networks-in-Theano

To-do

More regularization and hyperparameter tuning
Visual question answering
Attention visualization
Interactive mode?

About

Dynamic Memory Networks (https://arxiv.org/abs/1603.01417) in Tensorflow
