opencog / relex

English Dependency Relationship Extractor

Home Page: http://wiki.opencog.org/w/RelEx

Failing tests: miscomparison

dagiopia opened this issue

Many of the tests are failing with this type of miscomparison.
Environment: Ubuntu 18.04 with OpenJDK 11.0.4 and Link Grammar 5.7.0

Error: content miscompare:
	Expected = [_poss(design, him), _predadj(design, bad), _quantity(design, all)]
	Got Binary Relations = [_poss(design, his), _predadj(design, bad), _quantity(design, all)]
	Got Unary Relations = [definite-FLAG(design, T), gender(his, masculine), noun_number(all, plural), noun_number(design, plural), pos(., punctuation), pos(all, adj), pos(bad, adj), pos(be, verb), pos(design, noun), pos(his, noun), possessive-FLAG(his, T), pronoun-FLAG(his, T), tense(bad, present)]
	Sentence = All his designs are bad.

full error message here

Yes, this is the case, and it's been like that for a very long time. (more than 5 years?)

These are all "fixable", in that the rules are specified in 2-3 files, and once you get a general feeling for how the rules work, it is not that hard to fix them up. Unfortunately, it takes a while to understand how the rules work, and fixing them is tedious. Constructing the grammar of a language by hand is hard work.

For this reason, most energy these days goes into developing algorithms that can learn these rules automatically. Of course, these algorithms don't work all that well, and developing them is hard. :-/

I was using ant for building and never ran the relex tests. The migration to maven was the change that exposed them for me.
Okay, closing since it's a known issue.
Thank you!

I mean, you can keep this open. We have two choices: someone should either fix relex, or we should reduce and eventually remove the dependency on it. The problem, though, is that ghost depends on R2L and R2L depends on relex ... it's stove-piped.

Yeah, right! Keeping this open would encourage a fix.
Which one are you leaning towards? Any plans?

I just realized that the relex unit tests don't actually have asserts. I would propose dividing them into two sets:

  • add asserts to the passing unit tests to guarantee they stay green;
  • mark the failing tests so that asserts can be added once the underlying issue is fixed.
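
A rough sketch of what that split could look like, in plain Java with no test framework. Everything here is hypothetical and not part of relex: the `RelationCheck` class, the `Outcome` enum, and the `knownFailure` flag are illustrative names only. The idea is that a passing test asserts and stays green, while a test marked as a known failure is tolerated until it starts passing, at which point it gets flagged so it can be moved into the asserted set.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of relex: classifies the result for one sentence.
class RelationCheck {
    enum Outcome { PASS, FAIL, KNOWN_FAILURE, UNEXPECTED_PASS }

    // Order-insensitive comparison of relation strings such as "_poss(design, him)".
    static boolean sameRelations(List<String> expected, List<String> got) {
        Set<String> e = new TreeSet<>(expected);
        Set<String> g = new TreeSet<>(got);
        return e.equals(g);
    }

    // knownFailure is an assumed per-test marker for the "failing" set.
    static Outcome classify(List<String> expected, List<String> got, boolean knownFailure) {
        boolean ok = sameRelations(expected, got);
        if (knownFailure) {
            return ok ? Outcome.UNEXPECTED_PASS : Outcome.KNOWN_FAILURE;
        }
        return ok ? Outcome.PASS : Outcome.FAIL;
    }

    // Throws on a regression, and on a known failure that now passes
    // (so that it gets promoted to the asserted set). Known failures are tolerated.
    static void assertOutcome(String sentence, List<String> expected,
                              List<String> got, boolean knownFailure) {
        Outcome o = classify(expected, got, knownFailure);
        if (o == Outcome.FAIL) {
            throw new AssertionError("content miscompare: " + sentence
                + "\n\tExpected = " + expected + "\n\tGot = " + got);
        }
        if (o == Outcome.UNEXPECTED_PASS) {
            throw new AssertionError("known failure now passes, promote it: " + sentence);
        }
    }

    public static void main(String[] args) {
        // Relations copied from the error report at the top of this issue.
        List<String> expected = List.of(
            "_poss(design, him)", "_predadj(design, bad)", "_quantity(design, all)");
        List<String> got = List.of(
            "_poss(design, his)", "_predadj(design, bad)", "_quantity(design, all)");

        // Marked as a known failure, so this does not throw.
        assertOutcome("All his designs are bad.", expected, got, true);
        System.out.println(classify(expected, got, true)); // prints KNOWN_FAILURE
    }
}
```

With the same inputs and `knownFailure` set to `false`, `assertOutcome` would throw, which is the behaviour wanted for the tests that pass today.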

One reason that the URE (unified rule engine) was developed for opencog was so that the relex rules could run on it (that is what made it "unified"). In the end, the relex rules were never actually ported to the URE. But they should be. Or, at least, someone should try porting a handful of them to see how hard it is and what issues pop up.

Also: most or all of the relex rules that identify the head-verb are no longer needed, because LG (Link Grammar) was changed to do this automatically. So a lot of simplification of the relex ruleset is possible. Much or most of the complexity in the relex rules was there to identify the head-verb. If this is removed, most of the remaining rules are really very simple, almost trivial.