berlino / disco_em19

A Neural Two-Stage Approach for Recognizing Discontiguous Entities (EMNLP 2019)

Setup

PyTorch (tested on 0.4) and Python 3

Code Structure

  • module/SegGraph.py: the segmental hypergraph for extracting segments
  • model/Coarse2Fine.py: the joint model that performs the segment extraction and merging
  • config.py: model and training configurations such as number of hidden units in LSTMs
  • train.py: performs training and testing

Data

We provide some processed sample data as pkl files for demonstration: data/examples.pkl and data/word_vec_200.pkl.
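The sample files can be inspected with Python's standard pickle module. The sketch below is a minimal illustration: the helper names load_pickle and describe are hypothetical and not part of this repository, and since the structure of the stored objects is not documented here, the code only reports their type and size.

```python
import os
import pickle


def load_pickle(path):
    """Load one pickled object from `path` (hypothetical helper, not part of the repo)."""
    # Only unpickle files you trust: unpickling can execute arbitrary code.
    with open(path, "rb") as f:
        return pickle.load(f)


def describe(obj):
    """Return the object's type name and its length (None if it has no length)."""
    size = len(obj) if hasattr(obj, "__len__") else None
    return type(obj).__name__, size


# Inspect the sample files if they are present in the working directory.
for path in ("data/examples.pkl", "data/word_vec_200.pkl"):
    if os.path.exists(path):
        print(path, describe(load_pickle(path)))
```

If the files contain PyTorch tensors, torch probably needs to be importable before they can be loaded.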

Note that we cannot distribute the data. The preprocessing scripts can be found in Aldrian's code.

Training and Testing

Running 'python train.py' starts the training process. The batch size is set to 1 during training. After each training epoch, the script outputs two kinds of metrics:

  • precision/recall/F1 on segment extraction
  • precision/recall/F1 on final entity extraction
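Metrics of this kind are commonly computed by exact match between predicted and gold items. The sketch below assumes that convention, which this README does not spell out. The function name and the representation of an entity as a tuple of (start, end) token spans are illustrative choices, not taken from train.py.

```python
def precision_recall_f1(predicted, gold):
    """Exact-match precision, recall and F1 between two collections of hashable items."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical example: each entity is a tuple of (start, end) token spans,
# so a discontiguous entity carries more than one span.
gold = {((0, 1),), ((3, 4), (7, 8))}
pred = {((0, 1),), ((3, 4),)}
p, r, f = precision_recall_f1(pred, gold)
print(p, r, f)
```

Here the prediction that recovers only one of the two spans of the discontiguous entity counts as wrong, so one of two predictions matches one of two gold entities.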

At the end of the script, the model that performed best on the development set is evaluated on the test set.
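This selection step can be sketched as a loop that keeps a snapshot of the state with the highest development score. It is an illustration only, not the logic in train.py: the callables train_epoch and evaluate are hypothetical, and which metric drives the selection is an assumption.

```python
import copy


def select_best(model_state, train_epoch, evaluate, num_epochs):
    """Train for `num_epochs` and return (best_state, best_dev_score, best_epoch).

    `train_epoch(state)` returns the updated state; `evaluate(state)` returns a
    development-set score where higher is better. Both are hypothetical callables.
    """
    best_state, best_score, best_epoch = None, float("-inf"), -1
    for epoch in range(num_epochs):
        model_state = train_epoch(model_state)
        score = evaluate(model_state)
        if score > best_score:
            # Deep-copy so later training does not overwrite the snapshot.
            best_state = copy.deepcopy(model_state)
            best_score, best_epoch = score, epoch
    return best_state, best_score, best_epoch


# Toy run: the "state" is a dict with a step counter, and the dev score peaks at step 3.
toy_scores = {1: 0.2, 2: 0.5, 3: 0.7, 4: 0.6}
state, score, epoch = select_best(
    {"step": 0},
    train_epoch=lambda s: {"step": s["step"] + 1},
    evaluate=lambda s: toy_scores[s["step"]],
    num_epochs=4,
)
print(state, score, epoch)
```

The returned state would then be evaluated once on the test set.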

License: MIT License

