kermitt2 / Wapiti

A simple and fast discriminative sequence labeling toolkit

This is a modified version of Wapiti 1.5

This version was originally patched by Vyacheslav Zholudev.

The present modified version adds the following features to the latest Wapiti release:

  • SWIG bindings, providing in particular a JNI interface for integrating Wapiti into Java, for both training and decoding;
  • CMake, a cross-platform build system, in place of Make, which makes it easier to build the library in the environment of your choice.

The source code was patched to avoid variable-length arrays (VLAs): main(int argc, char *argv[argc]) is replaced by main(int argc, char **argv). In addition, some modifications have been made to produce binary models that are independent of the environment LOCALE, to avoid a rare JVM crash when Wapiti is used through JNI, and to prioritize local library linking when necessary (the Wapiti library then becomes "portable" across different Linux distributions and versions).

To build the code, from the Wapiti root path:

mkdir build
cd build
cmake ..
make

libwapiti.so (Linux) or libwapiti.dylib (macOS) will appear in the build subdirectory.

The jar file can be found under src/swig.
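
As a rough illustration of how a Java program might pick up the native library produced by the build above, the sketch below resolves the platform-specific file name and loads it if it exists. The class WapitiNativeLoader and its methods are hypothetical and not part of this repository; the sketch only uses standard JDK calls and assumes the library sits in the build directory. The SWIG-generated classes from the jar would be used after the library is loaded; their API is not shown here.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the Wapiti sources.
public class WapitiNativeLoader {

    /**
     * Returns the expected location of the native library inside buildDir.
     * System.mapLibraryName("wapiti") yields the platform-specific file name,
     * e.g. libwapiti.so on Linux or libwapiti.dylib on macOS.
     */
    public static Path resolveLibrary(Path buildDir) {
        return buildDir.resolve(System.mapLibraryName("wapiti"));
    }

    /**
     * Loads the native library if the file exists.
     * Returns false (without throwing) when the file is missing.
     */
    public static boolean loadIfPresent(Path buildDir) {
        Path lib = resolveLibrary(buildDir).toAbsolutePath();
        if (!Files.isRegularFile(lib)) {
            return false;
        }
        // System.load requires an absolute path to the library file.
        System.load(lib.toString());
        return true;
    }

    public static void main(String[] args) {
        // Assumption: the build directory is "build" unless given as an argument.
        Path buildDir = Paths.get(args.length > 0 ? args[0] : "build");
        boolean loaded = loadIfPresent(buildDir);
        System.out.println("Library " + resolveLibrary(buildDir)
                + (loaded ? " loaded" : " not found"));
    }
}
```

Loading by absolute path avoids depending on java.library.path; passing -Djava.library.path=build and calling System.loadLibrary("wapiti") would be an alternative.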


Wapiti - A linear-chain CRF tool

Copyright (c) 2009-2013  CNRS
All rights reserved.

For more detailed information see the homepage.

Wapiti is a very fast toolkit for segmenting and labeling sequences with discriminative models. It is based on maxent models, maximum entropy Markov models and linear-chain CRFs, and offers various optimization and regularization methods to improve both the computational complexity and the prediction performance of standard models. Wapiti has been ranked first on the sequence tagging task for more than a year on the MLcomp website.

Wapiti is developed by LIMSI-CNRS and was partially funded by ANR projects CroTaL (ANR-07-MDCO-003) and MGA (ANR-07-BLAN-0311-02).

For suggestions, comments, or patches, you can contact me at lavergne@limsi.fr

If you use Wapiti for research purposes, please use the following citation:

@inproceedings{lavergne2010practical,
    author    = {Lavergne, Thomas and Capp\'{e}, Olivier and Yvon,
                 Fran\c{c}ois},
    title     = {Practical Very Large Scale {CRFs}},
    booktitle = {Proceedings of the 48th Annual Meeting of the Association
                 for Computational Linguistics ({ACL})},
    month     = {July},
    year      = {2010},
    location  = {Uppsala, Sweden},
    publisher = {Association for Computational Linguistics},
    pages     = {504--513},
    url       = {http://www.aclweb.org/anthology/P10-1052}
}
