castorini / VDPWI-NN-Torch

Very Deep Pairwise Word Interaction Neural Networks for modeling textual similarity (He and Lin, NAACL/HLT 2016)

Very-Deep Pairwise Word Interaction Neural Networks for Modeling Textual Similarity

NOTE: This repo contains the original Torch implementation from the NAACL 2016 paper. The code is no longer maintained and has been superseded by a PyTorch reimplementation in Castor. This repo exists solely for archival purposes.

This repo contains the Torch implementation of the very-deep pairwise word interaction neural network for modeling textual similarity, as described in He and Lin (NAACL/HLT 2016).

This model does not require external resources such as WordNet or parsers, does not use sparse features, and achieves good accuracy on standard public datasets.

Installation and Dependencies

  • Install the Torch deep learning library. We recommend the local installation, which includes all the packages our tool requires. Follow the instructions here: https://github.com/torch/distro

  • Our tool currently runs only on CPUs, so we recommend using the Intel MKL library (or at least OpenBLAS) so that Torch runs much faster on CPUs.

  • Our tool also requires the GloVe embeddings from Stanford. Run fetch_and_preprocess.sh to download and preprocess these embeddings (around 3 GB).

Running

  • Command to run (training, tuning, and testing are all included):

      th trainSIC.lua

The tool outputs Pearson correlation scores and writes the predicted similarity score for each sentence pair in the test data to the predictions directory.
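To make the reported metric concrete, here is a minimal Python sketch of how a Pearson score relates predicted similarity scores to gold scores. It is not part of this repo's code. The helper names `pearson` and `read_scores` are hypothetical, and the one-score-per-line file layout assumed by `read_scores` is an assumption, since the exact format of the files in the predictions directory is not described here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def read_scores(path):
    """Read one float per non-empty line.

    ASSUMPTION: this layout is illustrative only; adapt the parsing to
    whatever the files in the predictions directory actually contain.
    """
    with open(path) as f:
        return [float(line) for line in f if line.strip()]


if __name__ == "__main__":
    # Made-up numbers for illustration, not results from the model.
    gold = [1.0, 2.5, 3.0, 4.5, 5.0]
    predicted = [1.2, 2.4, 3.3, 4.1, 4.9]
    print(round(pearson(gold, predicted), 4))
```

A value close to 1 means the predicted scores rise and fall with the gold scores. The metric ignores any constant offset or positive scaling between the two.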
