Additional Wikidata triples
JanKalo opened this issue · comments
Hi there,
we are interested in working with your training data set for fact extraction.
In the paper you mention that T-REx does not contain 1000 triples for all properties, so you add extra triples from Wikidata. However, I cannot find these triples in the .jsonl files. Some of the properties actually have fewer than 1000 triples.
Am I missing something?
It would be nice if you could clarify where I can find these additional triples, or whether you did not use them in training after all.
Best,
Jan
Hi Jan, thanks for your interest in our paper! The extra triples from Wikidata are in the train.jsonl and val.jsonl files for each relation. Unfortunately, some of the relations had very few triples in general, so relations like P1376 and P108 will have fewer than 1000 data points. We tried our best to collect up to 1000 triples for the rest of the relations, though.
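For anyone who wants to check the per-relation counts themselves, here is a minimal sketch. It assumes a hypothetical layout with one directory per relation (e.g. `P108/`), each holding `train.jsonl` and `val.jsonl` with one JSON object per line; the function names and the root path are illustrative, not part of the repository.

```python
import json
from pathlib import Path


def count_triples(root, splits=("train.jsonl", "val.jsonl")):
    """Count non-empty JSON lines per relation directory under `root`.

    Assumes (hypothetically) one sub-directory per relation, each
    containing the split files named in `splits`.
    """
    counts = {}
    for rel_dir in sorted(Path(root).iterdir()):
        if not rel_dir.is_dir():
            continue  # skip stray files next to the relation folders
        total = 0
        for split in splits:
            path = rel_dir / split
            if not path.exists():
                continue  # a relation may lack one of the splits
            with path.open(encoding="utf-8") as f:
                for line in f:
                    if line.strip():
                        json.loads(line)  # raises ValueError on a malformed line
                        total += 1
        counts[rel_dir.name] = total
    return counts


def relations_below(counts, threshold=1000):
    """Return only the relations with fewer than `threshold` data points."""
    return {rel: n for rel, n in counts.items() if n < threshold}
```

Calling `relations_below(count_triples("path/to/data"))` would then list the relations that fall short of 1000 data points across both splits, where `path/to/data` stands in for wherever the relation folders actually live.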
Oh, you are right. I was somehow expecting more relations to have 1000 data points.
Thanks.