Training a Parser: Error while Indexing BinaryRule
markfarrell opened this issue
Hi,
Experiencing problems using the parser trainer. My goal is to train a parser with CRAFT's treebank for the biology domain; not sure if the layout of CRAFT's treebank is supported by epic or not. Tried to start out by training a parser on "smallbank" -- no luck. Any advice?
$ java -cp target/scala-2.11/epic-assembly-0.2.jar epic.parser.models.ParserTrainer \
  --treebankType simple \
  --treebank.path "src/main/resources/smallbank" \
  --modelFactory epic.parser.models.SpanModelFactory \
  --cache.path constraints.cache \
  --opt.useStochastic true \
  --opt.regularization 1.0
[main] INFO epic.parser.models.ParserTrainer$ - Training Parser...
Exception in thread "main" java.lang.RuntimeException:
error while indexingBinaryRule(VP[^SINV], VBN[^VP], PP[^VP]) to
BinaryRule(VP, VBN, PP)0
at epic.parser.projections.ProjectionIndexer$$anonfun$apply$5.apply(ProjectionIndexer.scala:115)
...
You probably ran it on a small corpus first? If so, delete xbar.gr and run the trainer again.
I should get rid of the need for that file, but deleting it should do it.
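For anyone hitting the same error, the cleanup step might look like the sketch below. It assumes xbar.gr was written to the directory the trainer was launched from; the thread does not say where the file lives, so adjust the path if your earlier run was started elsewhere.

```shell
# Assumption: xbar.gr sits in the current working directory,
# i.e. the directory from which the earlier training run was started.
if [ -f xbar.gr ]; then
  rm -- xbar.gr
  echo "removed stale xbar.gr"
else
  echo "no xbar.gr here; check the directory the earlier run was started from"
fi
```

After that, re-run the same ParserTrainer command as before.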
Oh, that seems to have worked. I can train the parser on "smallbank". Do you think it will be much work to train the parser on the CRAFT corpus? I believe it contains a collection of biology articles and a list of parsed sentences for each article.