justhalf / weak-semi-crf-naacl2016

The code for Weak Semi CRF (together with Linear CRF and Semi CRF) on the new SMSNP dataset.

By: Aldrian Obaja
Date: 29 Mar 2016
====================
Description of files
====================

** Essential files **
src/ - The main source folder. The model implementations are here
lib/ - The folder containing all the required libraries
experiments/ - The folder containing the main experiment scripts
data/ - The dataset used in the paper

pom.xml - The Maven pom file

create_brown_clusters.bash - To create Brown cluster information.
                             Note that this requires the Brown clustering binary
                             wcluster to be present in the folder brown-cluster/.
                             The Brown clustering software can be downloaded at
                             https://github.com/percyliang/brown-cluster
create_tokenized_dataset.bash - To create SMSNP.conll.* files
get_span_histogram.bash - To print statistics on span and token length
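
The wcluster requirement of create_brown_clusters.bash can be checked before
running the script. The snippet below is a minimal sketch, not part of this
repository: have_wcluster is a hypothetical helper name, and the only facts it
relies on are the folder name brown-cluster/ and the binary name wcluster
given above.

```shell
# Sketch: check that the Brown clustering binary is where
# create_brown_clusters.bash expects it (brown-cluster/wcluster).
# have_wcluster is a hypothetical helper, not a script in this repository.
have_wcluster() {
    # $1: repository root (defaults to the current directory)
    wc_root="${1:-.}"
    if [ -x "$wc_root/brown-cluster/wcluster" ]; then
        echo "found: $wc_root/brown-cluster/wcluster"
        return 0
    fi
    echo "missing: $wc_root/brown-cluster/wcluster" >&2
    echo "download it from https://github.com/percyliang/brown-cluster" >&2
    return 1
}
```

A possible use is `have_wcluster && bash create_brown_clusters.bash`. Whether
the script takes arguments is not described here, so check the script itself
before running it.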

** Generated files **
target/ - The binary files generated by Maven
dependency-reduced-pom.xml - The file generated by Maven
63-c100-p1.out/ - The Brown cluster information generated by
                  "create_brown_clusters.bash"

==============
How to compile
==============
1. Get Maven (https://maven.apache.org)
2. Run `mvn package` in this folder
   This step will produce "target/experiments-smsnp-1.0-SNAPSHOT.jar"
3. The jar file can be executed with:
       java -jar target/experiments-smsnp-1.0-SNAPSHOT.jar
   See the scripts under experiments/ for how it was used
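
The steps above can be combined into one shell function. This is only a
sketch under stated assumptions: build_and_run is a hypothetical wrapper (it
is not one of the scripts in experiments/), and it assumes that mvn and java
are on the PATH and that it is called from this folder. The jar path is the
one named in step 2.

```shell
# Sketch of the compile-and-run steps as a single function.
# build_and_run is a hypothetical wrapper, not part of this repository.
JAR="target/experiments-smsnp-1.0-SNAPSHOT.jar"

build_and_run() {
    # Build only when the jar does not exist yet.
    if [ ! -f "$JAR" ]; then
        if ! command -v mvn >/dev/null 2>&1; then
            echo "mvn not found on PATH" >&2
            return 1
        fi
        mvn package || return 1
    fi
    # Arguments are passed through unchanged; the real ones are
    # in the scripts under experiments/.
    java -jar "$JAR" "$@"
}
```

The function skips the Maven step when the jar is already present, so after
changing the source code run `mvn package` again by hand (or delete the jar)
to pick up the changes.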

To edit the source code properly, open it as a Maven project.
For example, in Eclipse, after opening this folder as a normal Java project,
right-click on the project and click Configure -> Convert to Maven Project

License: GNU General Public License v3.0


Languages: Java 86.2%, Python 7.1%, Shell 4.9%, Perl 1.9%