Specifying CoreNLP jar files location
Xyaneon opened this issue · comments
We are missing some setup information for the CoreNLP jar files. Currently, when I try to run integrated.py, I get the following backtrace:
************************************************** Loading CoreNLP **************************************************
Traceback (most recent call last):
File "./integrated.py", line 5, in <module>
import test
File "/home/christopher/LTUAssistant/test.py", line 4, in <module>
proc = CoreNLP("parse")
File "/home/christopher/.local/lib/python2.7/site-packages/stanford_corenlp_pywrapper/sockwrap.py", line 139, in __init__
assert any(os.path.exists(f) for f in deglobbed), "CoreNLP jar files don't seem to exist; are the paths correct? Searched files: %s" % repr(deglobbed)
AssertionError: CoreNLP jar files don't seem to exist; are the paths correct? Searched files: <itertools.chain object at 0x7fa7b4d45c50>
Using information from here for the stanford_corenlp_pywrapper module, it seems the issue is that the CoreNLP package must be downloaded and extracted first, and its path must then be passed to the CoreNLP() constructor through the corenlp_jars argument, like so:
proc = CoreNLP("parse", corenlp_jars=["/home/christopher/Downloads/stanford-corenlp-full-2015-04-20/*"])
Unfortunately, this in turn requires knowing ahead of time where the folder is on each user's machine.
We should either:
- Specify in the README where the user should place this folder
- Include a setup script in the top-level project directory which will download and set up CoreNLP for the user in the proper place
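For the setup script option, something along these lines might work. This is only a rough sketch: the download URL, the `CORENLP_DIR` and `CORENLP_URL` constants, and the function names `extract_corenlp` and `setup_corenlp` are my own assumptions, and it also assumes the archive unpacks into a single top-level folder with the same name as the .zip file.

```python
import os
import zipfile

try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2

# Assumed values: adjust the folder name and URL to the CoreNLP release needed.
CORENLP_DIR = "stanford-corenlp-full-2015-04-20"
CORENLP_URL = "http://nlp.stanford.edu/software/" + CORENLP_DIR + ".zip"


def extract_corenlp(zip_path, project_root):
    """Extract the archive into project_root and return the CoreNLP folder path."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(project_root)
    return os.path.join(project_root, CORENLP_DIR)


def setup_corenlp(project_root, url=CORENLP_URL):
    """Download and extract CoreNLP unless the folder already exists."""
    target = os.path.join(project_root, CORENLP_DIR)
    if os.path.isdir(target):
        # Already set up; nothing to download or extract.
        return target
    zip_path = target + ".zip"
    if not os.path.exists(zip_path):
        # Only hit the network when no local copy of the archive is present.
        urlretrieve(url, zip_path)
    return extract_corenlp(zip_path, project_root)
```

A script in the top-level project directory could then call `setup_corenlp(os.path.dirname(os.path.abspath(__file__)))` once before integrated.py is run, so the folder always ends up in a known place relative to the project.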
Actually, another option is to simply include the CoreNLP files in the top-level directory and change line 4 of test.py to:
proc = CoreNLP("parse", corenlp_jars=["stanford-corenlp-full-2015-04-20/*"])
This works for me, but it could take up a lot more space in the repository (the original downloaded .zip archive for CoreNLP was about 344 MB).
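One caveat with a bare relative path is that it is resolved against the current working directory, so it may only work when the script is launched from the project root. A possible way around that, sketched below, is to build the path from the script's own location and optionally let users override it. The `CORENLP_HOME` environment variable and the `find_corenlp_jars` helper are hypothetical names I made up for illustration; they are not part of stanford_corenlp_pywrapper.

```python
import os

# Hypothetical names: CORENLP_HOME is an assumed environment variable, and
# DEFAULT_DIR matches the CoreNLP release folder mentioned in this issue.
DEFAULT_DIR = "stanford-corenlp-full-2015-04-20"


def find_corenlp_jars(base_dir, environ=None):
    """Return a corenlp_jars list, preferring CORENLP_HOME over a bundled copy."""
    environ = os.environ if environ is None else environ
    # Fall back to the folder next to the script when no override is set.
    home = environ.get("CORENLP_HOME") or os.path.join(base_dir, DEFAULT_DIR)
    if not os.path.isdir(home):
        raise IOError("CoreNLP folder not found: %s" % home)
    return [os.path.join(home, "*")]


# Possible use in test.py (left commented out because it needs the wrapper):
# here = os.path.dirname(os.path.abspath(__file__))
# proc = CoreNLP("parse", corenlp_jars=find_corenlp_jars(here))
```

This keeps the bundled folder as the default while still allowing a user who already has CoreNLP elsewhere to point at it, which avoids a second 344MB copy.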