Training a Parser: Error while Indexing BinaryRule

Question

markfarrell opened this issue 10 years ago · comments

Hi,

I'm having trouble with the parser trainer. My goal is to train a parser on CRAFT's treebank for the biology domain, but I'm not sure whether epic supports the layout of CRAFT's treebank. I tried to start by training a parser on "smallbank", with no luck. Any advice?

$ java -cp target/scala-2.11/epic-assembly-0.2.jar epic.parser.models.ParserTrainer \
   --treebankType simple \
   --treebank.path "src/main/resources/smallbank" \
   --modelFactory epic.parser.models.SpanModelFactory \
   --cache.path constraints.cache \
   --opt.useStochastic true \
   --opt.regularization 1.0
[main] INFO epic.parser.models.ParserTrainer$ - Training Parser...
Exception in thread "main" java.lang.RuntimeException: 
error while indexingBinaryRule(VP[^SINV], VBN[^VP], PP[^VP]) to 
BinaryRule(VP, VBN, PP)0
at epic.parser.projections.ProjectionIndexer$$anonfun$apply$5.apply(ProjectionIndexer.scala:115)
...

David Hall · Answer 1 · Sun Oct 12 2014 03:03:43 GMT+0800 (China Standard Time)

You probably ran it on a small corpus first? Delete xbar.gr.

I should get rid of the need for that file, but deleting it should fix this.
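
The suggestion above amounts to clearing the leftover grammar file before retraining. A minimal sketch, assuming xbar.gr sits in the directory the trainer was launched from (the thread does not say where it is written); the function name clear_stale_grammar is hypothetical, and the file is moved to a backup rather than deleted outright:

```shell
# Hypothetical helper: move a leftover grammar file out of the way.
# Assumption: the file lives in the current working directory unless
# a different path is passed as the first argument.
clear_stale_grammar() {
  cache="${1:-xbar.gr}"
  if [ -f "$cache" ]; then
    # Keep a backup instead of deleting, in case the file is wanted later.
    mv "$cache" "$cache.bak"
    echo "moved $cache to $cache.bak"
  else
    echo "no $cache found in $(pwd)"
  fi
}

clear_stale_grammar xbar.gr
# Afterwards, rerun the ParserTrainer command from the question.
```

Moving rather than removing is only a precaution; a plain `rm xbar.gr` matches the advice more literally.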

(In reply to Mark Farrell's message of Wed, Oct 8, 2014, 12:13 AM.)


Mark Farrell · Answer 2 · Sun Oct 12 2014 03:39:04 GMT+0800 (China Standard Time)

Oh, that seems to have worked. I can train the parser on "smallbank". Do you think it will be much work to train the parser on the CRAFT corpus? I believe it contains a collection of biology articles and a list of parsed sentences for each article.