dig-team / amie

Mavenized AMIE+Typing

Duplicates after instantiation

propi opened this issue

Hi,

Does AMIE+ have any options to control how variables are substituted by constants?

For example, suppose this kind of rule: (?c p ?b) ^ (?a p ?c) => (?a p ?b).
Can I require that a constant be bound to at most one variable, so that substitutions such as {?a/A, ?b/B, ?c/B} are disabled? With this substitution we get a path with two duplicate triples mapped to the rule: (B p B) ^ (A p B) => (A p B).

If the KG contains triples (B p B), (A p B) then the rule has support in data and is valid. But, can I disable this kind of instantiation and consider only non-duplicate triples within instantiation during support and confidence computing?
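To make the question concrete, here is a minimal Python sketch (not AMIE code) that enumerates the groundings of the rule on the two-triple KG above and then keeps only the groundings whose triples are pairwise distinct. The names KG, RULE, ground and groundings are hypothetical and chosen only for illustration, and the brute-force enumeration is an assumption that says nothing about how AMIE actually computes support.

```python
from itertools import product

# Toy KG from the question: (B p B) and (A p B).
KG = {("B", "p", "B"), ("A", "p", "B")}

# Rule (?c p ?b) ^ (?a p ?c) => (?a p ?b); the head atom is listed last.
RULE = [("?c", "p", "?b"), ("?a", "p", "?c"), ("?a", "p", "?b")]


def ground(atoms, subst):
    """Replace each variable by its value; constants are left untouched."""
    return [tuple(subst.get(term, term) for term in atom) for atom in atoms]


def groundings(atoms, kg):
    """Yield every substitution whose grounded atoms all occur in the KG."""
    variables = sorted({t for atom in atoms for t in atom if t.startswith("?")})
    entities = sorted({e for s, _, o in kg for e in (s, o)})
    for values in product(entities, repeat=len(variables)):
        subst = dict(zip(variables, values))
        triples = ground(atoms, subst)
        if all(triple in kg for triple in triples):
            yield subst, triples


all_matches = list(groundings(RULE, KG))
# Keep only groundings in which no triple is used twice.
distinct_matches = [(s, t) for s, t in all_matches if len(set(t)) == len(t)]

print(len(all_matches), len(distinct_matches))
```

On this toy KG every grounding of the rule reuses a triple, so the filtered list is empty. This is the behaviour I would like to obtain.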

This problem also arises for rules with constants. For example, take the rule (?a p C) ^ (?b p ?a) => (?a p ?b) and the substitution {?a/A, ?b/C}; we then get (A p C) ^ (C p A) => (A p C). I need to avoid these duplications because they distort support and confidence for my use case.

Hi,

As far as I know, AMIE3 has an option to exclude non-injective mappings when counting support. I do not know exactly how to enable it, but @lajus will very likely be able to reply.

Best,
Luis

Hi,

> If the KG contains triples (B p B), (A p B) then the rule has support in data and is valid. But, can I disable this kind of instantiation and consider only non-duplicate triples within instantiation during support and confidence computing?

Yes, this feature exists, but it has not been merged into the master branch yet, as we would like to test it further before including it in a release (it has worked as intended in every test we have run so far).

To get it, you will have to clone and compile the gpro branch of this repository:

git clone https://github.com/lajus/amie
cd amie
git checkout gpro
mvn clean install

Then, you can run the generated amie3.jar file with the additional option:
-bias amie.mining.assistant.experimental.InjectiveMappingsAssistant

> This problem is also for rules with constants. For example, suppose this rule: (?a p C) ^ (?b p ?a) => (?a p ?b) and substitution: {?a/A, ?b/C}, then we get this: (A p C) ^ (C p A) => (A p C). I need to avoid these duplications because it affects support and confidence incorrectly for my use case.

That should already be the case. When AMIE instantiates such a rule, it should automatically add the atom ?b != C (?b different from C) to the rule (to be verified, but it should also add the atom ?a != C in this case). Is that not what you observe?

Regards,
Jonathan

Thank you for the answer.

For the second case, the substitution {?a/C, ?b/B} is correct, since it yields the path (C p C) ^ (B p C) => (C p B), which contains no duplicates. Therefore, the condition ?a != C is not necessary.
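The effect of the != atoms on the two substitutions discussed here can be checked with a small Python sketch. It is only an illustration under assumptions: the names RULE, ground and satisfies are hypothetical, and the inequality atoms are modelled as plain (variable, constant) pairs rather than in AMIE's own representation.

```python
# Rule (?a p C) ^ (?b p ?a) => (?a p ?b); the head atom is listed last.
RULE = [("?a", "p", "C"), ("?b", "p", "?a"), ("?a", "p", "?b")]


def ground(atoms, subst):
    """Replace each variable by its value; constants are left untouched."""
    return [tuple(subst.get(term, term) for term in atom) for atom in atoms]


def satisfies(subst, inequalities):
    """Check atoms of the form ?x != c, given as (variable, constant) pairs."""
    return all(subst[var] != const for var, const in inequalities)


duplicated = {"?a": "A", "?b": "C"}  # grounds to (A p C) ^ (C p A) => (A p C)
accepted = {"?a": "C", "?b": "B"}    # grounds to (C p C) ^ (B p C) => (C p B)

only_b = [("?b", "C")]               # ?b != C
a_and_b = [("?a", "C"), ("?b", "C")] # ?a != C and ?b != C

for name, subst in [("duplicated", duplicated), ("accepted", accepted)]:
    triples = ground(RULE, subst)
    has_duplicates = len(set(triples)) < len(triples)
    print(name, has_duplicates, satisfies(subst, only_b), satisfies(subst, a_and_b))
```

In this sketch ?b != C alone rejects the substitution that produces the duplicate triple, while adding ?a != C would also reject {?a/C, ?b/B}, whose grounding has no duplicates.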

One last question: do you have any benchmarks for DefaultMiningAssistant vs InjectiveMappingsAssistant? The results could be very interesting, since two scenarios are possible:

  1. InjectiveMappingsAssistant will be slower due to duplicate checking.
  2. InjectiveMappingsAssistant will be faster thanks to pruning more rules since the support of some rules will be lower.

Well,

To be clear:

  • injective mappings = two distinct variables cannot be mapped to the same value.
  • injective mappings in practice = two unconnected variables cannot be mapped to the same value.

This is different from "the grounded rule does not contain duplicated atoms".
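A small Python sketch can illustrate the distinction on the substitutions from this thread. It is not AMIE code: the helper names are hypothetical, it implements only the first definition (distinct variables map to distinct values), and it does not model the "unconnected variables" refinement.

```python
# Both rules from this thread; the head atom is listed last.
RULE_1 = [("?c", "p", "?b"), ("?a", "p", "?c"), ("?a", "p", "?b")]
RULE_2 = [("?a", "p", "C"), ("?b", "p", "?a"), ("?a", "p", "?b")]


def ground(atoms, subst):
    """Replace each variable by its value; constants are left untouched."""
    return [tuple(subst.get(term, term) for term in atom) for atom in atoms]


def is_injective(subst):
    """True if no two distinct variables are mapped to the same value."""
    return len(set(subst.values())) == len(subst)


def has_duplicate_atoms(atoms, subst):
    """True if the grounded rule contains the same triple more than once."""
    triples = ground(atoms, subst)
    return len(set(triples)) < len(triples)


cases = [
    (RULE_1, {"?a": "A", "?b": "B", "?c": "B"}),  # not injective, duplicates
    (RULE_1, {"?a": "A", "?b": "A", "?c": "B"}),  # not injective, no duplicates
    (RULE_2, {"?a": "A", "?b": "C"}),             # injective, yet duplicates
]

for atoms, subst in cases:
    print(subst, is_injective(subst), has_duplicate_atoms(atoms, subst))
```

The last case is injective over the variables but still produces a duplicated atom, because ?b takes the value of the constant C that already appears in the rule.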

When it comes to injective mappings, AMIE assumes that there are no reflexive relations in the KB. Otherwise, numerous technical and theoretical issues may arise (cf #33).

Now, the addition of the != atom when you allow instantiations is independent of the injective mappings (cf #31). In your particular case, I think it should work.

As for the benchmarks, you can find some in the AMIE 3 paper in Table 5 (injective mappings is iPCA).

Injective mappings is slower than Default, probably because we had to disable the upper bound optimization that can no longer apply in this setting.

Hi.

Regarding the benchmarks, I think it is also valid to consider #42 (and especially check #42 (comment)).

AMIE 3 (and I think AMIE+ too) had a bug where the same operator could be executed multiple times. This did not change the resulting rules, but it did increase the runtime.

As the bug is cumulative along the class inheritance chain, InjectiveMappingsAssistant was the most affected. Injective mappings are still slower than the default, though.

Note that #42 has not been merged to gpro. May I suggest doing so, @lajus?
Also, it would be interesting to see some updated benchmarks since the bug also affected LazyAssistant/default.