shatu / cogcomp-nlp

CogComp's Natural Language Processing libraries

Home Page:http://deagol.cs.illinois.edu:8080/

CogCompNLP

This project collects a number of core libraries for Natural Language Processing (NLP) developed by the Cognitive Computation Group.

CogComp's main NLP libraries

Each library contains a detailed README with instructions on how to use it. In addition, the Javadoc for the whole project is available here.

| Module | Description |
| --- | --- |
| nlp-pipeline | Provides an end-to-end NLP processing application that runs a variety of NLP tools on input text. |
| core-utilities | Provides a set of NLP-friendly data structures and a number of NLP-related utilities that support writing NLP applications, running experiments, etc. |
| corpusreaders | Provides classes to read documents from corpora into core-utilities data structures. |
| curator | Supports use of CogComp NLP Curator, a tool to run NLP applications as services. |
| edison | A library for feature extraction from core-utilities data structures. |
| lemmatizer | An application that uses WordNet and simple rules to find the root forms of words in plain text. |
| tokenizer | An application that identifies sentence and word boundaries in plain text. |
| transliteration | An application that transliterates names between different scripts. |
| pos | An application that identifies the part of speech (e.g. verb + tense, noun + number) of each word in plain text. |
| ner | An application that identifies named entities in plain text according to two different sets of categories. |
| md | An application that identifies entity mentions in plain text. |
| relation-extraction | An application that identifies entity mentions, then identifies relation pairs among the detected mentions. |
| quantifier | A tool that detects mentions of quantities in text and normalizes them to a standard form. |
| inference | A suite of unified wrappers to a set of optimization libraries, as well as some basic approximate solvers. |
| depparse | An application that identifies the dependency parse tree of a sentence. |
| verbsense | A system that addresses the verb sense disambiguation (VSD) problem for English. |
| prepsrl | An application that identifies semantic relations expressed by prepositions and develops statistical learning models for predicting the relations. |
| commasrl | Software that extracts the relations that commas participate in. |
| similarity | Software that compares objects (especially Strings) and returns a score indicating how similar they are. |
| temporal-normalizer | A temporal extractor and normalizer. |
| dataless-classifier | Classifies text into a user-specified label hierarchy from just the textual label descriptions. |
| external-annotators | A collection of useful external annotators. |

  • Questions? Have a look at our FAQs.

Using each library programmatically

To include one of the modules in your Maven project, add the following snippet with the #modulename# and #version# entries replaced with the relevant module name and the version listed in this project's pom.xml file. Note that you also need to add the <repository> element for the CogComp Maven repository inside the <repositories> element.

    <dependencies>
        ...
        <dependency>
            <groupId>edu.illinois.cs.cogcomp</groupId>
            <artifactId>#modulename#</artifactId>
            <version>#version#</version>
        </dependency>
        ...
    </dependencies>
    ...
    <repositories>
        <repository>
            <id>CogCompSoftware</id>
            <name>CogCompSoftware</name>
            <url>http://cogcomp.org/m2repo/</url>
        </repository>
    </repositories>
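
As an illustration of filling in the placeholders, a complete minimal fragment might look like the sketch below. The artifactId `illinois-nlp-pipeline` and the version `4.0.13` are assumptions used only for illustration; they are not confirmed by this README. Take the real artifactId from the module's own pom.xml and the real version from this project's pom.xml.

```xml
<!-- Sketch only: artifactId and version are illustrative assumptions. -->
<dependencies>
    <dependency>
        <groupId>edu.illinois.cs.cogcomp</groupId>
        <!-- Assumed artifactId for the nlp-pipeline module; verify in its pom.xml -->
        <artifactId>illinois-nlp-pipeline</artifactId>
        <!-- Placeholder version; use the one listed in this project's pom.xml -->
        <version>4.0.13</version>
    </dependency>
</dependencies>

<repositories>
    <!-- CogComp Maven repository, as given in the snippet above -->
    <repository>
        <id>CogCompSoftware</id>
        <name>CogCompSoftware</name>
        <url>http://cogcomp.org/m2repo/</url>
    </repository>
</repositories>
```

Both elements go inside the top-level <project> element of your pom.xml. If Maven cannot resolve the dependency, the first things to check are the artifactId and version against the project's pom.xml files.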

License:Other


Languages

Java 98.0%, Python 1.5%, Shell 0.4%, Perl 0.1%