torchkge-team / torchkge

TorchKGE: Knowledge Graph embedding in Python and PyTorch.

[Feature request] Possibility to provide false facts to the `KnowledgeGraph` class

Tazoeur opened this issue · comments

Currently, the KnowledgeGraph class accepts a data frame containing three columns (['from', 'rel', 'to']), and I feel it would be nice to also be able to provide some facts that are known to be false, marked with a fourth column containing a boolean value.
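
To make the idea concrete, here is a minimal sketch of what such a data frame could look like and how it could be split; the column name 'true' and the example facts are only placeholders:

```python
import pandas as pd

# Hypothetical example of the proposed input: the usual ['from', 'rel', 'to']
# columns plus a boolean column marking which facts are known to be true.
df = pd.DataFrame({
    'from': ['paris', 'paris', 'london'],
    'rel':  ['capital_of', 'capital_of', 'capital_of'],
    'to':   ['france', 'germany', 'france'],
    'true': [True, False, False],
})

# The known-true facts would be handled as today, while the known-false ones
# could be kept aside for negative sampling or evaluation.
df_true = df[df['true']].drop(columns='true')
df_false = df[~df['true']].drop(columns='true')
```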

It could be used as a complement to the false facts that are generated by the sampler during the training of a model.
And I think it would be particularly interesting when using the test kg with the LinkPredictionEvaluator, to which we could provide false facts to analyse the accuracy of the model being evaluated.

What is your opinion on the subject?

I'll take this opportunity to thank you for the majestic work you've been doing so far!

Hi, thanks for the message.

It is indeed quite easy to add false statements as attributes to the KnowledgeGraph class. I think it's easier though to add three attributes corresponding to the heads, tails and relations of the false statements. Keeping those separate from the known true ones ensures that we won't break working objects or use false statements by mistake.

It is not clear to me though what the perks are. I haven't used a KB recording false statements yet. It could help the negative sampling indeed but it is not clear what pairing could be used in the loss functions between true and false statements. This needs to be discussed.

When it comes to link prediction, how do you measure the accuracy of the model using false statements exactly? I see how it could be used in triplet classification but not link prediction.

Anyways, if such a feature could be useful to you, you can certainly submit an implementation and I'll be available to discuss the details of it with you.

Hi Armand,

Thanks for your prompt reply!

It could help the negative sampling indeed but it is not clear what pairing could be used in the loss functions between true and false statements.

What do you mean by "pairing"?

Datasets such as nell186 come with explicit counter-examples, so I believe it may be useful for TorchKGE to offer the possibility to inherit negative triples from an external process (in addition to the possibility of implementing such a process using the classes and modules available in the library).

When it comes to link prediction, how do you measure the accuracy of the model using false statements exactly?

I think Guillaume did not mean the accuracy metric as defined in standard classification, but rather in a broader sense. He probably has in mind ranking metrics such as Hits@k or the MRR, which would allow us to see whether false triples always get lower scores than true triples.

Best,
Luis

Hello Armand, Luis,

When it comes to link prediction, how do you measure the accuracy of the model using false statements exactly? I see how it could be used in triplet classification but not link prediction.

I think that I mixed up the two tasks.

I am currently looking for a way to add some facts known to be false to the KnowledgeGraph and I feel like the easiest way would be to give them to a NegativeSampler, which is already responsible for managing false facts.
Or to some kind of entity that would be attached to the KnowledgeGraph and used later by the NegativeSampler.

Do you think this would be a good strategy?
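
Just to illustrate the idea, here is a rough sketch of a sampler that serves stored counter-examples before falling back to a regular TorchKGE sampler; the class name, attribute names and the corrupt_batch interface are assumptions on my side:

```python
import torch

class CounterExampleSampler:
    def __init__(self, false_heads, false_tails, false_relations, fallback):
        # Index tensors describing the known-false facts, plus a regular
        # sampler (e.g. one of TorchKGE's negative samplers) as fallback.
        self.false_heads = false_heads
        self.false_tails = false_tails
        self.false_relations = false_relations
        self.fallback = fallback

    def corrupt_batch(self, heads, tails, relations):
        n = heads.shape[0]
        if self.false_heads.shape[0] >= n:
            # Draw n stored counter-examples at random.
            idx = torch.randint(self.false_heads.shape[0], (n,))
            return self.false_heads[idx], self.false_tails[idx]
        # Not enough explicit counter-examples: corrupt the batch as usual.
        return self.fallback.corrupt_batch(heads, tails, relations)
```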

I am also implementing another sampling method that Luis came up with (inspired by RuDiK [1] if I am not mistaken) and which is close to the PositionalNegativeSampler.
Hoping it will help someone else.

Also, I do not fully understand the difference between the validation set kg_val and the testing set kg_test; could you point me to some references that would help me understand it?

References

[1] Ortona, Stefano, Venkata Meduri, and Paolo Papotti (2018). RuDiK: rule discovery in knowledge bases. Proceedings of the VLDB Endowment, 11, 1946-1949. doi:10.14778/3229863.3236231.

Hi Luis & Guillaume,

What do you mean by "pairing"?

I mean that during training, true triplets and false statements are usually paired in the computation of the loss. Sometimes the pairing is forced, like in the Margin loss, and sometimes it is indeed a bit artificial, like in the Sigmoid loss or the BCE loss.
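
To illustrate what I mean by pairing, here is a plain PyTorch sketch (not TorchKGE's actual implementation) of a margin loss, where each true triplet's score is compared to the score of the false triplet generated from it:

```python
import torch
import torch.nn.functional as F

# Scores of three true triplets and of the false triplets paired with them
# (higher score = more plausible).
golden_scores = torch.tensor([0.9, 0.7, 0.8])
negative_scores = torch.tensor([0.2, 0.6, 0.1])

# Each pair contributes max(0, margin - positive_score + negative_score).
margin = 0.5
loss = F.relu(margin - golden_scores + negative_scores).mean()
```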

Datasets such as nell186 come with explicit counter-examples

This is indeed a good argument in favor of including such features. It might be good to include a loading function load_nell186 in this file as well. I'll do this myself when the PR is open in order to store the required files on the same server as the other datasets.
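
Something along these lines, just as a sketch; the file name, column layout and label convention are assumptions, and the real loader would also download the archive like the other load_* functions do:

```python
import pandas as pd
from torchkge.data_structures import KnowledgeGraph

def load_nell186(path):
    # Assumed layout: tab-separated triples with a label column where
    # 1 marks a true fact and -1 an explicit counter-example.
    df = pd.read_csv(path, sep='\t', names=['from', 'rel', 'to', 'label'])
    kg = KnowledgeGraph(df=df[df['label'] == 1][['from', 'rel', 'to']])
    df_false = df[df['label'] == -1][['from', 'rel', 'to']]
    return kg, df_false
```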

I think Guillaume did not mean the accuracy metric as defined in standard classification, but rather in a broader sense.

I get that known false statements can help measure the performance of embedding methods in some ways, but I maintain that it is not obvious how, as almost all the metrics I know of (Hits@k and MRR included) are based on recovering a fact known to be true. Looking into the scores of false triplets could, I think, result in a useful but new metric. It is important to think about this now because it could influence how we choose to store these new triplets in TorchKGE, to make it easy to plug them into a new evaluation technique.
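
For instance, such a metric could be the fraction of (true, false) pairs in which the known-false triplet is scored below the true one. A rough sketch, assuming the model exposes a scoring function over index tensors where higher means more plausible (as I understand the models' scoring_function):

```python
import torch

def false_below_true_rate(model, true_idx, false_idx):
    # true_idx and false_idx are (head, tail, relation) index tensors.
    with torch.no_grad():
        true_scores = model.scoring_function(*true_idx)
        false_scores = model.scoring_function(*false_idx)
    # AUC-like statistic: compare every false score to every true score.
    return (false_scores.unsqueeze(1) < true_scores.unsqueeze(0)).float().mean().item()
```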

I am currently looking for a way to add some facts known to be false to the KnowledgeGraph and I feel like the easiest way would be to give them to a NegativeSampler, which is already responsible for managing false facts.
Or to some kind of entity that would be attached to the KnowledgeGraph and used later by the NegativeSampler.

This is a good idea. As I mentioned earlier, another way is to simply add new attributes to the KnowledgeGraph class that default to None when no false statements are known. Choosing between the two options depends on what you plan to do with the false statements exactly.
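
For the attribute-based option, a minimal sketch of what I have in mind (the attribute names are only placeholders):

```python
class KnowledgeGraphWithFalseFacts:
    """Sketch only: the known-false facts live in separate index tensors,
    which stay None when no counter-examples are provided."""

    def __init__(self, df, false_head_idx=None, false_tail_idx=None,
                 false_relations=None):
        self.df = df  # the usual ['from', 'rel', 'to'] facts
        self.false_head_idx = false_head_idx
        self.false_tail_idx = false_tail_idx
        self.false_relations = false_relations
```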

I am also implementing another sampling method that Luis came up with and which is close to the PositionalNegativeSampler. Hoping it will help someone else.

Good!

Also, I do not fully understand the difference between the validation set kg_val and the testing set kg_test; could you point me to some references that would help me understand it?

Are you familiar with train/validation/test splits in machine learning in general? This is the same here. The validation set contains samples (here, facts) that are used for hyper-parameter tuning, and the test set is only used at the end of the training process in order to get final performance on unknown data. See here for more details. Note that in ML in general the split is not fixed, but it is for most KG datasets.
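
In practice, the built-in loaders return the three fixed sets directly, and for a custom graph the split_kg method can generate them; a short sketch (import paths and keyword names from memory, please check the documentation):

```python
from torchkge.utils.datasets import load_fb15k

# Standard datasets ship with a fixed train/validation/test split.
kg_train, kg_val, kg_test = load_fb15k()

# For a custom KnowledgeGraph, the split can be generated instead.
# Here we simply re-split the training graph as an example.
kg_train2, kg_val2, kg_test2 = kg_train.split_kg(share=0.8, validation=True)
```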

Hi Armand & Guillaume,

I get that known false statements can help measure the performance of embedding methods in some ways, but I maintain that it is not obvious how, as almost all the metrics I know of (Hits@k and MRR included) are based on recovering a fact known to be true. Looking into the scores of false triplets could, I think, result in a useful but new metric. It is important to think about this now because it could influence how we choose to store these new triplets in TorchKGE, to make it easy to plug them into a new evaluation technique.

I see your point. By looking at a few papers I understood that Hits@k and MRR require generating counter-examples from true triples, so fixing those counter-examples would result in a different metric, as you said. That said, classical accuracy may benefit from supporting explicit counter-examples: that way we could measure it for triple classification on datasets such as nell186.
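
As a sketch of what that could look like, assuming a score threshold chosen on validation data and a model scoring function where higher means more plausible (names are assumptions):

```python
import torch

def triple_classification_accuracy(model, true_idx, false_idx, threshold):
    # true_idx / false_idx: (head, tail, relation) index tensors of the
    # test facts and of the explicit counter-examples (e.g. from nell186).
    with torch.no_grad():
        true_scores = model.scoring_function(*true_idx)
        false_scores = model.scoring_function(*false_idx)
    correct = (true_scores > threshold).sum() + (false_scores <= threshold).sum()
    return correct.item() / (len(true_scores) + len(false_scores))
```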

Are you familiar with train/validation/test splits in machine learning in general? This is the same here. The validation set contains samples (here, facts) that are used for hyper-parameter tuning, and the test set is only used at the end of the training process in order to get final performance on unknown data. See here for more details. Note that in ML in general the split is not fixed, but it is for most KG datasets.

@Tazoeur: It may be interesting for us to exploit this TorchKGE feature. In our last meeting we discussed using the hyper-parameters reported in the papers of the different models. For the cases where those hyper-parameters are not explicitly reported, we could rely on TorchKGE's hyper-parameter optimization.

@armand33: thank you for all your support!