LISA: Linguistically-Informed Self-Attention

This is a work-in-progress but much-improved re-implementation of the linguistically-informed self-attention (LISA) model described in the following paper:

Emma Strubell, Patrick Verga, Daniel Andor, David Weiss, and Andrew McCallum. Linguistically-Informed Self-Attention for Semantic Role Labeling. Conference on Empirical Methods in Natural Language Processing (EMNLP). Brussels, Belgium. October 2018.

To replicate the results in the paper exactly, at the cost of an unpleasantly hacky codebase, you can use the original LISA code here.
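
For intuition, LISA's central idea is to supervise one self-attention head so that each token attends to its syntactic parent, training dependency parsing jointly with SRL in a shared encoder. Below is a minimal, hypothetical sketch of that auxiliary objective, not this repo's actual code; the names (syntactically_informed_head, gold_heads, token_mask) are illustrative:

    import tensorflow as tf

    def syntactically_informed_head(queries, keys, gold_heads, token_mask):
        # queries, keys: [batch, seq_len, head_dim]
        # gold_heads:    [batch, seq_len], index of each token's dependency head
        # token_mask:    [batch, seq_len], 1.0 for real tokens, 0.0 for padding
        scores = tf.matmul(queries, keys, transpose_b=True)  # [batch, seq, seq]
        scores /= tf.sqrt(tf.cast(tf.shape(keys)[-1], tf.float32))
        # Auxiliary parsing loss: push this head's attention toward the parent.
        parse_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=gold_heads, logits=scores)
        parse_loss = tf.reduce_sum(parse_loss * token_mask) / tf.reduce_sum(token_mask)
        attention = tf.nn.softmax(scores, axis=-1)  # used downstream as usual
        return attention, parse_loss

In the paper, this parse loss is added to the SRL objective, and at test time a higher-quality external parse can be swapped into this head's attention.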

Requirements:

  • Python >= 3.6
  • TensorFlow >= 1.10
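
A quick way to check both requirements from a Python prompt (plain Python, no repo code assumed):

    import sys
    import tensorflow as tf

    assert sys.version_info >= (3, 6), sys.version
    print("TensorFlow", tf.__version__)  # expect 1.10 or newer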

Quick start:

Data setup (CoNLL-2005):

  1. Get pre-trained word embeddings (GloVe); a loading sketch for a quick sanity check follows this list:
    wget -P embeddings http://nlp.stanford.edu/data/glove.6B.zip
    unzip -j embeddings/glove.6B.zip glove.6B.100d.txt -d embeddings
    
  2. Get CoNLL-2005 data in the right format using this repo. Follow the instructions there all the way through the further preprocessing step.
  3. Make sure the correct data paths are set in config/conll05.conf
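
As a sanity check for step 1, here is a minimal sketch of loading the unzipped GloVe file into a vocabulary and embedding matrix (plain Python/NumPy, not the repo's own loading code; load_glove is an illustrative name):

    import numpy as np

    def load_glove(path="embeddings/glove.6B.100d.txt"):
        # Each line is: token v1 v2 ... v100
        vocab, rows = [], []
        with open(path, encoding="utf-8") as f:
            for line in f:
                token, *values = line.rstrip().split(" ")
                vocab.append(token)
                rows.append(np.asarray(values, dtype=np.float32))
        return vocab, np.stack(rows)

    vocab, embeddings = load_glove()
    print(len(vocab), embeddings.shape)  # 400000 (400000, 100)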

Train a model:

To train a model with save directory model using the configuration conll05-lisa.conf:

bin/train.sh config/conll05-lisa.conf --save_dir model

Evaluate a model:

To evaluate the latest checkpoint saved in the directory model:

bin/evaluate.sh config/conll05-lisa.conf --save_dir model
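
CoNLL-2005 SRL is scored with span-level precision, recall, and F1 over labeled arguments, as in the official srl-eval.pl scorer (presumably the Perl code in this repo). A minimal illustrative sketch of that metric, assuming gold and predicted spans as tuples (not the repo's evaluation code):

    def span_f1(gold_spans, pred_spans):
        # Spans are tuples like (sentence_id, predicate_idx, start, end, label).
        gold, pred = set(gold_spans), set(pred_spans)
        tp = len(gold & pred)
        precision = tp / len(pred) if pred else 0.0
        recall = tp / len(gold) if gold else 0.0
        f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
        return precision, recall, f1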


License: Apache License 2.0

