scalanlp / chalk

Chalk is a natural language processing library.

Transient NullPointerException in sentence detection

jbnunn opened this issue:

When trying sentenceDetector.sentDetect(someText), I sometimes get:

NullPointerException: null (GISModel.java:127)
[error] chalk.learn.model.IndexHashTable.get(IndexHashTable.java:118)
[error] chalk.learn.maxent.GISModel.eval(GISModel.java:127)
[error] chalk.learn.maxent.GISModel.eval(GISModel.java:107)
[error] chalk.learn.maxent.GISModel.eval(GISModel.java:99)
[error] chalk.tools.sentdetect.SentenceDetectorME.sentPosDetect(SentenceDetectorME.java:185)
[error] chalk.tools.sentdetect.SentenceDetectorME.sentDetect(SentenceDetectorME.java:134)
...

If I rerun the test, the error usually goes away immediately. Has anyone else experienced this?

I'm not sure what is going on -- everything works fine for me. Can you do the following?

  $ cd /tmp
  $ wget http://opennlp.sourceforge.net/models-1.5/en-sent.bin
  $ cd $CHALK_DIR
  $ ./build
  > console
  scala> import java.io.FileInputStream
  import java.io.FileInputStream

  scala> import chalk.tools.sentdetect._
  import chalk.tools.sentdetect._

  scala> val sdetector = new SentenceDetectorME(new SentenceModel(new FileInputStream("/tmp/en-sent.bin")))
  sdetector: chalk.tools.sentdetect.SentenceDetectorME = chalk.tools.sentdetect.SentenceDetectorME@74dd590f

  scala> val sentences = sdetector.sentDetect("Here is a sentence. Here is another with Mr. Brown in it. Hurrah.")
  sentences: Array[java.lang.String] = Array(Here is a sentence., Here is another with Mr. Brown in it., Hurrah.)

  scala> sentences.foreach(println)
  Here is a sentence.
  Here is another with Mr. Brown in it.
  Hurrah.

Thanks, Jason--works like a champ, every time.

So my guess is that I didn't build correctly the first time. Now, after a rebuild, I can't reproduce the error, which is good. Sorry for the false alarm :)

I have found myself running into this error occasionally. I will see if I can get a consistent set of steps to reproduce.

This happens when a single SentenceDetectorME instance is used from multiple threads. It is not a thread-safe class.

@fatefree I can confirm this. I've been running into this issue occasionally, and it is a threading problem. Each thread needs its own SentenceDetector!
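
For anyone landing here later, one way to give each thread its own detector is a ThreadLocal. The following is only a minimal sketch: `PerThread` and `FakeDetector` are hypothetical names made up for illustration (they are not part of Chalk), and the stand-in class exists only so the snippet runs without the library or a model file.

```scala
// Hypothetical helper (not part of Chalk): lazily creates one instance
// of T per thread, so a non-thread-safe object is never shared.
final class PerThread[T](make: () => T) {
  private val local = new ThreadLocal[T] {
    override def initialValue(): T = make()
  }
  def get: T = local.get()
}

// Stand-in for a non-thread-safe detector so the sketch runs on its own.
// With Chalk, the factory passed to PerThread would instead build the real
// detector, using the constructor shown earlier in this thread:
//   () => new SentenceDetectorME(
//           new SentenceModel(new FileInputStream("/tmp/en-sent.bin")))
final class FakeDetector {
  // Naive split after sentence-ending punctuation; for illustration only.
  def sentDetect(text: String): Array[String] =
    text.split("(?<=[.!?])\\s+")
}

object PerThreadDemo {
  // Shared holder; each thread that calls `get` receives its own instance.
  val detector = new PerThread(() => new FakeDetector)

  def main(args: Array[String]): Unit = {
    val threads = (1 to 4).map { i =>
      new Thread(new Runnable {
        def run(): Unit = {
          val sentences = detector.get.sentDetect("Here is a sentence. Hurrah.")
          println(s"thread $i: ${sentences.length} sentences")
        }
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
  }
}
```

Note that with the factory written as in the comment, every thread loads the model file itself. Whether a single loaded SentenceModel can safely be shared across threads is something I have not verified for Chalk, so the sketch does not assume it.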