seq2seq-attention

Introduction

This code implements RNN/LSTM/GRU seq2seq and seq2seq+attention models for word-level training and sampling. You can apply it to chatbots, automatic text summarization, machine translation, question answering systems, etc. Here, we show a bot demo.
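To make the "+attention" part concrete, here is a minimal sketch of a single attention step in plain Python. It assumes dot-product scoring, which may not be the scoring function the models in this repository use, and the names `softmax` and `attend` are illustrative only.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """One attention step with dot-product scoring (an assumption).

    decoder_state:  list of floats, the current decoder hidden state
    encoder_states: list of lists of floats, one hidden state per source word
    Returns (weights, context): the attention weights over the source words
    and the weighted sum of the encoder states.
    """
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    weights = softmax(scores)
    dim = len(decoder_state)
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states)) for i in range(dim)
    ]
    return weights, context
```

In a seq2seq+attention decoder, a context vector like this is typically combined with the decoder state before predicting the next word, so each output word can focus on different source words.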

Requirements

senna

This interface supports part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. It is used during sampling.

You can find how to install senna here

hdf5

HDF5 is a file format that is fast, flexible, and supported by a wide range of other software, including MATLAB, Python, and R.

You can find how to install hdf5 here

cutorch/cunn

If you want to run the code on a GPU, you need to install cutorch and cunn.

[sudo] luarocks install cutorch

[sudo] luarocks install cunn

Dataset

We use the Wikipedia Talkpages Conversations Dataset as our corpus to implement a conversation bot. After the download finishes, make sure the data file is in the data/ directory.

Run

Step 1 run the data preprocessing code to generate the dataset file and the vocabulary file.

python bot.py

If you want to work with other datasets or tasks, you may need to write your own Python preprocessing script and have it write the resulting data into an hdf5 file.
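As a rough starting point for such a script, the sketch below builds a word-level vocabulary, encodes sentences as fixed-length id sequences, and writes them with h5py. It is not the logic of bot.py: the special tokens, the dataset names `source` and `target`, and all function names are assumptions, and the training code may expect a different layout.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # hypothetical special tokens


def build_vocab(sentences, min_count=1):
    """Map each word seen at least min_count times to an integer id."""
    counts = Counter(word for s in sentences for word in s.split())
    words = sorted(w for w, c in counts.items() if c >= min_count)
    # Ids start at 1 on the assumption that they index Lua/Torch tables,
    # which are 1-based.
    return {w: i for i, w in enumerate([PAD, UNK] + words, start=1)}


def encode(sentence, vocab, max_len):
    """Convert a sentence to exactly max_len ids, truncating or padding."""
    ids = [vocab.get(w, vocab[UNK]) for w in sentence.split()][:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))


def write_hdf5(path, sources, targets):
    """Write two equally sized id matrices to an HDF5 file.

    Requires the third-party h5py package; the dataset names are assumptions.
    """
    import h5py

    with h5py.File(path, "w") as f:
        f.create_dataset("source", data=sources, dtype="int32")
        f.create_dataset("target", data=targets, dtype="int32")
```

For a conversation corpus, `sources` would hold the encoded input utterances and `targets` the encoded replies, one row per pair.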

Step 2 run the training code.

th train.lua

Step 3 run the sampling code.

th test.lua

You can change the parameters or choose the model you need in the CmdLine() part of the code.

Step 4 run the bot server code.

th server.lua

Step 5 test the bot response through bot server.

curl http://localhost:8080/bot -d "input=<your input words>"
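The same request can be sent from Python using only the standard library. This is a sketch: it assumes the server accepts a form-encoded `input` field, as the curl command suggests, and returns the reply as UTF-8 text. The response format is not documented here, and `build_request` and `ask_bot` are illustrative names.

```python
from urllib import parse, request

BOT_URL = "http://localhost:8080/bot"  # address used in the curl command


def build_request(text, url=BOT_URL):
    """Build a POST request carrying the form field input=<text>."""
    body = parse.urlencode({"input": text}).encode("utf-8")
    return request.Request(url, data=body, method="POST")


def ask_bot(text, url=BOT_URL, timeout=10):
    """Send text to the bot server and return the raw response body.

    Assumes the server replies with UTF-8 text.
    """
    with request.urlopen(build_request(text, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With the server from Step 4 running, `print(ask_bot("how are you"))` should print whatever the bot returns.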

Acknowledgements

Thanks to the Oxford deep learning course code and Karpathy's char-rnn code.

Contact

If you have any problems with the code, you can open an issue or send me an email (mcgrady150318 at gmail.com/163.com). I will be glad to discuss the code with you.

About

{% img /2016/05/13/Paper翻译列表/qrcode.jpg 350 350 %}

Zhihu Column
