dlwh / epic

**Archived** Epic is a high performance statistical parser written in Scala, along with a framework for building complex structured prediction models.

Home Page: http://scalanlp.org/

Please make the epic-pos-en model available

mark-watson opened this issue

If you have time, please make the epic-pos-en model available. I have the Treebank data, but it would be easier to not build it myself (and other people might use it if it were available).

actually it is available! version 2015.1.25

Is it possible to get some instructions on how to use this? The following fails on my machine:

java -Xmx4g -cp target/scala-2.11/epic-assembly-0.4-SNAPSHOT.jar epic.parser.ParseText --model ~/Downloads/epic-parser-en-span_2.10-2014.6.3-SNAPSHOT.jar --nthreads 4 /tmp/travel.txt

My suggestion would be to add a "hello world" example that runs with sbt run. Happy to create one if @dlwh is interested!

what's the error you're getting?

(Sent by email in reply to Brian Topping's comment of Tue, May 12, 2015 at 9:43 AM.)

Exception in thread "main" java.util.NoSuchElementException: None.get
    at scala.None$.get(Option.scala:344)
    at scala.None$.get(Option.scala:342)
    at epic.parser.ParseText$.classPathLoad(ParseText.scala:21)
    at epic.parser.ParseText$.classPathLoad(ParseText.scala:11)
    at epic.util.ProcessTextMain$class.main(ProcessTextMain.scala:45)
    at epic.parser.ParseText$.main(ParseText.scala:11)
    at epic.parser.ParseText.main(ParseText.scala)

I'm just getting started on my day, gonna take a look as well.

https://github.com/briantopping/epic/blob/master/src/main/scala/epic/models/package.scala#L21-21 is failing. The IOException is getting swallowed by the caller; it is:

java.io.InvalidClassException: epic.parser.models.ParserTrainer$$anonfun$2; local class incompatible: stream classdesc serialVersionUID = 0, local class serialVersionUID = 5531977503861241212
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:617)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at epic.models.package$$anonfun$1.applyOrElse(package.scala:21)
        ...
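
The two traces fit together: if a loader catches the deserialization failure and hands back None, the only thing the caller can report is None.get. Below is a minimal sketch of keeping the cause visible instead. The names LoadDemo, readModel, loadOpt and loadEither are hypothetical and are not Epic's API, and Try.toEither assumes Scala 2.12 or later.

```scala
import java.io.{ByteArrayInputStream, ObjectInputStream}
import scala.util.Try

object LoadDemo {
  // Hypothetical reader: deserializes one object from raw bytes.
  def readModel(bytes: Array[Byte]): AnyRef = {
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes))
    try in.readObject() finally in.close()
  }

  // Swallowing variant: every failure collapses to None, so a later
  // .get throws NoSuchElementException with no hint of the real cause.
  def loadOpt(bytes: Array[Byte]): Option[AnyRef] =
    Try(readModel(bytes)).toOption

  // Reporting variant: the original exception is preserved in the Left.
  def loadEither(bytes: Array[Byte]): Either[Throwable, AnyRef] =
    Try(readModel(bytes)).toEither
}
```

With loadEither, a caller can print the underlying exception (here it would have been the InvalidClassException) rather than failing later on an empty Option.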

Opened #32.

Yeah, Scala changed the serialVersionUID of every anonymous function to 0 somewhere in
the 2.11.x release cycle, which has broken all serialized model files,
again.

I have to stop using Java serialization. I just don't know what to use
instead.
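
For readers hitting the same InvalidClassException: Java serialization compares the serialVersionUID stored in the stream with the one on the local class, so an ID chosen by the compiler can change between compiler versions. The sketch below shows one common mitigation, assuming the serialized function can be turned into a named class with a pinned ID. LengthFeature and SerialUidDemo are hypothetical names for illustration; this is not how Epic's models are defined.

```scala
import java.io.{ObjectStreamClass, Serializable}

// Hypothetical stand-in for a function object stored inside a model.
// A named class with a pinned ID keeps the same serialVersionUID
// regardless of how the compiler handles anonymous function classes.
@SerialVersionUID(1L)
class LengthFeature extends (String => Int) with Serializable {
  def apply(s: String): Int = s.length
}

object SerialUidDemo {
  // The ID that Java serialization writes to, and expects from, a stream.
  def uidOf(cls: Class[_]): Long =
    ObjectStreamClass.lookup(cls).getSerialVersionUID
}
```

Calling SerialUidDemo.uidOf(classOf[LengthFeature]) returns the pinned value 1. Pinning only helps for models written after the change; files already serialized with a different ID still fail to load.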

On Wed, May 13, 2015 at 11:21 PM, Brian Topping notifications@github.com
wrote:

https://github.com/briantopping/epic/blob/master/src/main/scala/epic/models/package.scala#L21-21
is failing. The IOException is getting swallowed by the caller; it is:

java.io.InvalidClassException: epic.parser.models.ParserTrainer$$anonfun$2; local class incompatible: stream classdesc serialVersionUID = 0, local class serialVersionUID = 5531977503861241212
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:617)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at epic.models.package$$anonfun$1.applyOrElse(package.scala:21)
    at epic.models.package$$anonfun$1.applyOrElse(package.scala:19)
    at scala.PartialFunction$$anonfun$runWith$1.apply(PartialFunction.scala:141)
    at scala.PartialFunction$$anonfun$runWith$1.apply(PartialFunction.scala:140)
    at scala.collection.Iterator$class.foreach(Iterator.scala:743)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1195)
    at scala.collection.TraversableOnce$class.collectFirst(TraversableOnce.scala:132)
    at scala.collection.AbstractIterator.collectFirst(Iterator.scala:1195)
    at epic.models.package$.readFromJar(package.scala:19)
    at epic.models.package$.deserialize(package.scala:58)
    at epic.models.package$.deserialize(package.scala:15)
    at epic.util.ProcessTextMain$class.main(ProcessTextMain.scala:42)
    at epic.parser.ParseText$.main(ParseText.scala:11)
    at epic.parser.ParseText.main(ParseText.scala)

Gonna work on clearing that up, but does this exception have any meaning
on your end? I'm running on JDK 1.7.0_79-b15.


Would you entertain a PR that switches to Protocol Buffers (https://developers.google.com/protocol-buffers)? I'm not at all sure I can pull it off; it's just a thought so far. Looking around, though, I don't see the parser model generators.
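
To illustrate what moving away from Java serialization buys, here is a minimal sketch of an explicit, versioned binary layout written with DataOutputStream. It is a hand-rolled stand-in for the idea, not Protocol Buffers and not Epic's model format: WeightsCodec, FormatVersion and the choice of a bare weight array are all assumptions made for the example, and a real parser model holds far more than one array.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.io.{DataInputStream, DataOutputStream, IOException}

object WeightsCodec {
  // Explicit format version, checked on read instead of a class-derived ID.
  val FormatVersion = 1

  // Layout: version (Int), length (Int), then each weight (Double).
  def write(weights: Array[Double]): Array[Byte] = {
    val buf = new ByteArrayOutputStream()
    val out = new DataOutputStream(buf)
    out.writeInt(FormatVersion)
    out.writeInt(weights.length)
    weights.foreach(w => out.writeDouble(w))
    out.flush()
    buf.toByteArray
  }

  def read(bytes: Array[Byte]): Array[Double] = {
    val in = new DataInputStream(new ByteArrayInputStream(bytes))
    val version = in.readInt()
    if (version != FormatVersion)
      throw new IOException(s"unsupported format version $version")
    // Read the length once, then that many doubles in order.
    Array.fill(in.readInt())(in.readDouble())
  }
}
```

Because the bytes describe data rather than compiler-generated classes, a file written this way does not depend on the Scala version that produced it. A schema-based format such as Protocol Buffers gives the same property with generated readers and writers.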

Ok, once I rolled back to e0238ce, I was able to get things running with the published models.

It would be great to get a CI process running that builds the models and publishes them as snapshots on Sonatype. I'd be happy to do this.

the problem is that the data files needed to build the models aren't freely
licensed, so I can't just stick them somewhere.

(Sent by email in reply to Brian Topping's comment of Thu, May 14, 2015 at 9:45 PM.)

Isn't that a derivative work?

I meant that I can't, e.g., put the data on GitHub and have Travis CI build
the models. I agree it seems like model files aren't subject to copyright,
but I can't publish an automated rebuild-models script.

(Sent by email in reply to reactormonk's comment of Sat, May 16, 2015 at 11:36 PM.)

I have a private CI server (Atlassian Bamboo) in a hardened installation in Equinix NY4. All the training files would remain protected, and the models would be published to Sonatype without exposing the data to anyone but yourself. I usually set up Travis CI for OSS projects, but this seems like a good reason to use a private instance.

I switched back to Scala 2.10.4 and Epic 0.2 and am still having this issue. Actually, this lib has never worked for me; I just try to get it running once a year.