
StatNLP: Hypergraph-based Structured Prediction Framework

StatNLP is a structured prediction framework developed by the StatNLP team. It provides a way for NLP researchers to rapidly develop structured models, including conditional random fields (CRFs), structured perceptron, structured SVM, and softmax-margin CRFs, as well as neural CRFs, with various inference strategies.

The underlying theory is based on a unified view of structured prediction. Check out our EMNLP 2017 tutorial: A Unified Framework for Structured Prediction: From Theory to Practice. The framework is built on the concept of a hypergraph, which makes it very general: it covers linear-chain graphical models such as HMMs, linear-chain CRFs, and semi-Markov CRFs, tree-based graphical models such as tree CRFs for constituency parsing, and many more.

Coupled with a generic training formulation based on a generalization of the inside-outside algorithm to acyclic hypergraphs, the framework supports rapid prototyping of novel graphical models: users only need to specify the graphical structure, and the framework handles the training procedure. The neural component is integrated through the Torch 7 package.
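
As a rough sketch (not spelled out in this README, with notation that is purely illustrative), the inside score of a node v in an acyclic hypergraph is computed bottom-up over its incoming hyperedges, with leaf nodes initialized to 1; this generalizes the forward/inside recursions of chain and tree models:

    \alpha(v) \;=\; \sum_{e \in \mathrm{in}(v)} \exp\big(\mathrm{score}(e)\big) \prod_{u \in \mathrm{tail}(e)} \alpha(u), \qquad \alpha(\text{leaf}) = 1

Outside scores are obtained by an analogous top-down pass, and together the two give the hyperedge marginals needed for CRF-style gradient computation.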

Existing Research

A number of research papers (Jie and Lu, 2018; Zou and Lu, 2018; Li and Lu, 2018; Muis and Lu, 2017; Amoualian et al., 2017; Lim et al., 2017; Jie et al., 2017; Li and Lu, 2017; Susanto and Lu, 2017; Lu et al., 2016; Muis and Lu, 2016a; Muis and Lu, 2016b; Lu, 2015; Lu and Roth, 2015; Lu, 2014) listed in the references below have successfully used our framework to build their novel models.

Direct Usage

To compile, ensure Maven is installed, and simply do:

    mvn clean package

This will create a runnable JAR in the target/ directory. Running the JAR file directly without any arguments will show the help message:

    java -jar target/statnlp-core-{VERSION}.jar

For example:

    java -jar target/statnlp-core-2017.1-SNAPSHOT.jar

The package comes with some predefined models, which you can directly use with your data:

Linear-chain CRF:

    java -jar target/statnlp-core-2017.1-SNAPSHOT.jar \
        --linearModelClass org.statnlp.example.linear_crf.LinearCRF \
        --trainPath data/train.data \
        --testPath data/test.data \
        --modelPath data/test.model \
        train test evaluate

The last line above defines the tasks to be executed, in that order.
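
The README does not fix the data format, so the snippet below is only an illustrative sketch: linear-chain CRF readers commonly expect CoNLL-style input with one token and its label per line and a blank line separating sentences. Check the LinearCRF example code for the exact format it parses.

    The     DT
    cat     NN
    sat     VBD
    .       .

    Dogs    NNS
    bark    VBP
    .       .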

This package also comes with a visualization GUI for seeing how the graphical models represent the input. Simply add the "visualize" task to the command above, as follows:

    java -jar target/statnlp-core-2017.1-SNAPSHOT.jar \
        --linearModelClass org.statnlp.example.linear_crf.LinearCRF \
        --trainPath data/train.data \
        visualize

In the visualization, you can drag the canvas to move around and scroll to zoom in and out. The left and right arrow keys switch to the previous and next instances.

Building New Models

To help you understand the whole package, we first describe its main components:

  • FeatureManager - The class where you implement the feature extractor
  • NetworkCompiler - The class where you implement the graphical model
  • Instance - The data structure that stores the input and the corresponding network structures. It is also used to store the gold output and the predictions made by the model during testing.

If your input is linear (e.g., tokenized sentences), you can use the built-in LinearInstance<OUT>, where OUT is the output type, such as Label for POS tagging or Tree for constituency parsing. For linear input there is also a built-in FeatureManager that can take in feature templates: TemplateBasedFeatureManager, which uses templates similar to those of CRF++.
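
For illustration only (this README does not document the exact set of macros TemplateBasedFeatureManager accepts, so treat this as a CRF++-style sketch and verify against the code), a template file typically defines unigram features over relative token positions and a bigram feature over adjacent labels:

    # Unigram features over the word column: %x[row offset, column index]
    U00:%x[-1,0]
    U01:%x[0,0]
    U02:%x[1,0]
    # Conjunction of the previous and current word
    U03:%x[-1,0]/%x[0,0]

    # Bigram feature over adjacent output labels
    B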

The main class you will typically need to implement is the NetworkCompiler, which is the core part where the graphical model is defined.
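
A minimal skeleton of a custom compiler and feature manager might look like the sketch below. Note that the class names are hypothetical and the imports, method names, and signatures are assumptions modelled loosely on the bundled examples rather than a definitive statement of the API; the example classes referenced in the next paragraph remain the authoritative guide.

    // Sketch only: the overridden method signatures below are assumptions;
    // check the bundled examples for the actual abstract methods of
    // NetworkCompiler and FeatureManager in your version of the framework.
    import org.statnlp.commons.types.Instance;
    import org.statnlp.hypergraph.FeatureArray;
    import org.statnlp.hypergraph.FeatureManager;
    import org.statnlp.hypergraph.GlobalNetworkParam;
    import org.statnlp.hypergraph.LocalNetworkParam;
    import org.statnlp.hypergraph.Network;
    import org.statnlp.hypergraph.NetworkCompiler;

    public class MyTaskNetworkCompiler extends NetworkCompiler {

        @Override
        public Network compileLabeled(int networkId, Instance inst, LocalNetworkParam param) {
            // Build the hypergraph that contains only the gold structure of this instance.
            return null; // construct and return the labeled network here
        }

        @Override
        public Network compileUnlabeled(int networkId, Instance inst, LocalNetworkParam param) {
            // Build the hypergraph that enumerates all candidate structures for this instance.
            return null; // construct and return the unlabeled network here
        }

        @Override
        public Instance decompile(Network network) {
            // After inference, read off the best structure from the network and
            // attach it to the instance as its prediction.
            return network.getInstance();
        }
    }

    class MyTaskFeatureManager extends FeatureManager {

        public MyTaskFeatureManager(GlobalNetworkParam param) {
            super(param);
        }

        @Override
        protected FeatureArray extract_helper(Network network, int parent_k, int[] children_k, int children_k_index) {
            // Return the features fired by the hyperedge from node parent_k to its children children_k.
            return null; // build and return a FeatureArray here
        }
    }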

This guide will be updated in the future, but for now you can follow the examples in the src/main/java/com/statnlp/example directory, with the corresponding main classes in the src/main/test/com/statnlp/example directory for executing the models.

References

  • Zhanming Jie and Wei Lu, Dependency-based Hybrid Tree for Semantic Parsing, EMNLP 2018
  • Yanyan Zou and Wei Lu, Learning Cross-lingual Distributed Logical Representations for Semantic Parsing, ACL 2018
  • Hao Li and Wei Lu, Learning with Structured Representations for Negation Scope Extraction, ACL 2018
  • Aldrian Obaja Muis and Wei Lu, Labeling Gaps Between Words: Recognizing Overlapping Mentions with Mention Separators, EMNLP 2017
  • Hesam Amoualian et al., Topical Coherence in LDA-based Models through Induced Segmentation, ACL 2017
  • Lim et al., MalwareTextDB: A Database for Annotated Malware Articles, ACL 2017
  • Jie et al., Efficient Dependency-guided Named Entity Recognition, AAAI 2017
  • Hao Li and Wei Lu, Learning Latent Sentiment Scopes for Entity-Level Sentiment Analysis, AAAI 2017
  • Raymond Hendy Susanto and Wei Lu, Semantic Parsing with Neural Hybrid Trees, AAAI 2017
  • Lu et al., A General Regularization Framework for Domain Adaptation, EMNLP 2016
  • Aldrian Obaja Muis and Wei Lu, Learning to Recognize Discontiguous Entities, EMNLP 2016
  • Aldrian Obaja Muis and Wei Lu, Weak Semi-Markov CRFs for NP Chunking in Informal Text, NAACL 2016
  • Wei Lu and Dan Roth, Joint Mention Extraction and Classification with Mention Hypergraphs, EMNLP 2015
  • Wei Lu, Constrained Semantic Forests for Improved Discriminative Semantic Parsing, ACL 2015
  • Wei Lu, Semantic Parsing with Relaxed Hybrid Trees, EMNLP 2014
