NLPbox / stanford-corenlp-docker

build/run the most current Stanford CoreNLP server in a docker container

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

stanford-corenlp-docker

This Dockerfile will build and run the most current release of the Stanford CoreNLP server in a docker container.

Usage

To download and run a prebuilt version of the CoreNLP server from Docker Hub locally at http://localhost:9000, just type:

docker run -p 9000:9000 nlpbox/corenlp

By default, CoreNLP will use up to 4GB of RAM. You can change this by setting the JAVA_XMX environment variable. Here, we're giving it 3GB:

docker run -e JAVA_XMX=3g -p 9000:9000 -ti nlpbox/corenlp

In order to build and run the container from scratch (e.g. if you want to use the most current release of Stanford CoreNLP, type:

docker buildx build -t corenlp https://github.com/NLPbox/stanford-corenlp-docker.git
docker run -p 9000:9000 corenlp

In another console, you can now query the CoreNLP REST API like this:

wget -q --post-data "Although they didn't like it, they accepted the offer." \
  'localhost:9000/?properties={"annotators":"parse","outputFormat":"json"}' \
  -O - | jq ".sentences[0].parse"

which will return this parse tree:

"(ROOT\n  (S\n    (SBAR (IN Although)\n      (S\n        (NP (PRP they))\n        (VP (VBD did) (RB n't)\n          (PP (IN like)\n            (NP (PRP it))))))\n    (, ,)\n    (NP (PRP they))\n    (VP (VBD accepted)\n      (NP (DT the) (NN offer)))\n    (. .)))"

If you need the full xml output and want to configure more parameters, try:

wget -q --post-data "Although they didn't like it, they accepted the offer." \
  'localhost:9000/?properties={ \
    "annotators":"tokenize,ssplit,pos,lemma,ner,parse", \
    "ssplit.eolonly":"false", "tokenize.whitespace":"true", \
    "outputFormat":"xml"}' \
  -O results.xml

About

build/run the most current Stanford CoreNLP server in a docker container


Languages

Language:Python 50.7%Language:Dockerfile 49.3%