Question about the PR curves of the GIDS dataset
YaNjIeE opened this issue · comments
Hi,
I would like to know whether the PR curves for the GIDS dataset are obtained the same way as for the NYT dataset, where only the non-NA labels are counted.
Also, how many data points from the test set do you plot in the PR curves?
BTW, I noticed that the GIDS dataset has a dev set. Did you use it, or only the train and test sets?
Looking forward to your reply.
Best.
Hi,
Yes, for drawing the PR curves for the GIDS dataset, we followed exactly the same procedure that we used for NYT.
Unlike NYT, GIDS comes with a dev set.
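
For readers following along, here is a minimal sketch of one common way to build a PR curve that only counts non-NA labels. It is not taken from the RESIDE code: the function name `pr_curve_non_na`, the array layout (one row per bag, one column per relation), and the assumption that NA is column 0 are all hypothetical choices made for illustration.

```python
import numpy as np

def pr_curve_non_na(scores, labels, na_index=0):
    """Precision/recall points over ranked non-NA (bag, relation) pairs.

    scores: (num_bags, num_relations) predicted probabilities (assumed layout).
    labels: (num_bags, num_relations) multi-hot gold labels (assumed layout).
    na_index: column of the NA relation (assumed to be 0 here).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # Drop the NA column so that only non-NA pairs are ranked and counted.
    keep = [r for r in range(scores.shape[1]) if r != na_index]
    flat_scores = scores[:, keep].ravel()
    flat_labels = labels[:, keep].ravel()

    total_positive = flat_labels.sum()
    if total_positive == 0:
        raise ValueError("no non-NA gold labels, recall is undefined")

    # Rank every remaining pair by descending score.
    order = np.argsort(-flat_scores, kind="stable")
    hits = np.cumsum(flat_labels[order])

    precision = hits / np.arange(1, len(order) + 1)
    recall = hits / total_positive
    return precision, recall
```

Under these assumptions the curve has one point per non-NA pair, that is, `num_bags * (num_relations - 1)` points in total. Whether the authors plot all of them or truncate the ranking is not stated in this thread.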
OK, thanks a lot.
I want to ask another question: what is probY in the dataset?
In the README, you said it's the relation alias, but I am confused about that.
Could you please explain it?
Thanks a lot.
Best.
In other words, what does the index in 'probY' represent for each sentence?
Through this pipeline (https://github.com/malllabiisc/RESIDE/blob/master/images/relation_alias.png),
we get a probability distribution over relations for each sentence in the dataset,
which is used as side information. That's what ProbY means.