cxncu001 / NLI

Models for Natural Language Inference (TensorFlow version), including 'A Decomposable Attention Model for Natural Language Inference', ..., to be continued.

Models for Natural Language Inference (NLI)

We are trying to reproduce some classical models from the Natural Language Inference literature and report their performance on the Stanford Natural Language Inference (SNLI) dataset.

Models

Environments

  • TensorFlow 1.3 or higher
  • Python 3.5
  • NumPy
  • scikit-learn
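
If you use pip, the dependencies above can be installed roughly as follows; the upper bound on the TensorFlow version is our assumption, since the code targets the 1.x API:

    pip3 install 'tensorflow>=1.3,<2.0' numpy scikit-learn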

Data preparation

nliutils.py provides helpers for data preparation; a standalone sketch of these steps follows the list below.

  • build_vocab(): Build the vocabulary from the training data.
  • load_vocab(): Load vocabulary from file.
  • convert_data(): Convert NLI data from 'JSON' format to the following 'TXT' format: gold_label ||| sentence1 ||| sentence2.
  • process_file(): Prepare data for the model: convert words to indices according to the vocabulary, pad sentences to a fixed length, create the corresponding mask arrays, and load the classification labels into a 1-D array.
  • batch_iter(): Generate mini-batches of data.
  • convert_embeddings(): Convert embeddings from TXT (one word embedding per line) to an easy-to-use format in Python, consisting of a 2-D NumPy array for the embeddings and a dictionary for the vocabulary.
  • pre-trained word embeddings: You can download pre-trained word embeddings from GloVe and use convert_embeddings() to convert them into the format the code expects.
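
The snippet below is a small, self-contained Python sketch of the conversion and padding/masking steps described above. It is illustrative only: the helper names (snli_json_to_txt, pad_and_mask) and their details are assumptions, not the actual functions in nliutils.py.

    # Illustrative sketch of the data-preparation steps described above.
    # These helpers are NOT the repository's functions; names and details are assumptions.
    import json
    import numpy as np

    def snli_json_to_txt(json_path, txt_path):
        """Convert SNLI JSONL lines to 'gold_label ||| sentence1 ||| sentence2'."""
        with open(json_path, encoding='utf-8') as fin, \
             open(txt_path, 'w', encoding='utf-8') as fout:
            for line in fin:
                ex = json.loads(line)
                if ex['gold_label'] == '-':  # SNLI uses '-' when annotators did not agree
                    continue
                fout.write('{} ||| {} ||| {}\n'.format(
                    ex['gold_label'], ex['sentence1'], ex['sentence2']))

    def pad_and_mask(sentences, word2id, max_len, unk_id=0):
        """Map tokens to indices, pad each sentence to max_len, and build 0/1 mask arrays."""
        ids = np.zeros((len(sentences), max_len), dtype=np.int32)
        mask = np.zeros((len(sentences), max_len), dtype=np.float32)
        for i, sent in enumerate(sentences):
            for j, tok in enumerate(sent.split()[:max_len]):
                ids[i, j] = word2id.get(tok, unk_id)
                mask[i, j] = 1.0
        return ids, mask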

Hyper-parameters

  • decompose:

Train model: python3 decompose/train.py --embeddings ../../res/embeddings/glove.840B.300d.we --train_em 0 -op adagrad -lr 0.05 --require_improvement 50000000 --vocab ../cdata/snli/vocab.txt -ep 300 --normalize 1 -l2 0.0 -bs 4 --report 16000 --save_per_batch 16000 -cl 100

Test model: python3 decompose/test.py -m modelfile -d testdata
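
As a reading aid, the sketch below restates the options from the training command as an argparse definition. The option names are taken from the command above, but the types and help strings are our assumptions, not the repository's documented semantics; check decompose/train.py for the real definitions.

    # Hypothetical argparse view of the flags in the training command above.
    # Option names come from the command; types and help strings are assumptions.
    import argparse

    parser = argparse.ArgumentParser(description='Train the decomposable attention model')
    parser.add_argument('--embeddings', help='path to converted pre-trained embeddings (.we)')
    parser.add_argument('--train_em', type=int, help='1 to fine-tune embeddings, 0 to keep them fixed (assumed)')
    parser.add_argument('-op', help='optimizer name, e.g. adagrad')
    parser.add_argument('-lr', type=float, help='learning rate')
    parser.add_argument('--require_improvement', type=int, help='early-stop patience in steps (assumed)')
    parser.add_argument('--vocab', help='path to vocab.txt')
    parser.add_argument('-ep', type=int, help='number of training epochs')
    parser.add_argument('--normalize', type=int, help='1 to normalize embeddings (assumed)')
    parser.add_argument('-l2', type=float, help='L2 regularization weight')
    parser.add_argument('-bs', type=int, help='batch size')
    parser.add_argument('--report', type=int, help='report metrics every N batches (assumed)')
    parser.add_argument('--save_per_batch', type=int, help='save a checkpoint every N batches (assumed)')
    parser.add_argument('-cl', type=int, help='maximum (cut-off) sentence length (assumed)')
    args = parser.parse_args()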

Results

Model        Acc reported in paper    Our Acc
decompose    86.3%                    86.28%
