Question about the PR curves of the GIDS dataset

YaNjIeE opened this issue 5 years ago

Hi,
I would like to know whether the PR curves for the GIDS dataset are obtained in the same way as for the NYT dataset, where only the non-NA labels are counted.
Also, how many data points from the test set did you plot in the PR curves?

BTW, I notice that the GIDS dataset has a dev set. Did you use it, or did you use only the train and test sets?

Looking forward to your reply.

Best.

ReDsUn commented 5 years ago

@svjan5

Shikhar Vashishth · Answer 1 · Sun Jan 26 2020 01:17:03 GMT+0800 (China Standard Time)

Hi,
Yes, for drawing the PR curves for the GIDS dataset, we followed exactly the same procedure that we used for NYT.
Unlike NYT, GIDS comes with a dev set.
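
For readers unfamiliar with the procedure discussed above, here is a minimal sketch of a PR curve that counts only non-NA labels. It is not taken from the RESIDE code. The function name `pr_curve_non_na`, the array layout (one row per test bag, one column per relation), and `na_index=0` are all assumptions made for illustration. In this sketch every non-NA (bag, relation) pair becomes one ranked point, so the number of points is the number of test bags times the number of non-NA relations; the authors' plots may use a different count.

```python
import numpy as np

def pr_curve_non_na(scores, labels, na_index=0):
    """Precision/recall over ranked non-NA (bag, relation) pairs.

    scores: (num_bags, num_relations) predicted probabilities (assumed layout).
    labels: (num_bags, num_relations) multi-hot gold labels (assumed layout).
    na_index: column of the NA relation (assumed to be 0 here).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # Drop the NA column so that only non-NA predictions are ranked.
    keep = [r for r in range(scores.shape[1]) if r != na_index]
    flat_scores = scores[:, keep].ravel()
    flat_labels = labels[:, keep].ravel()

    total_pos = flat_labels.sum()
    if total_pos == 0:
        raise ValueError("no non-NA gold labels; recall is undefined")

    # Rank all pairs by descending confidence.
    order = np.argsort(-flat_scores, kind="stable")
    hits = np.cumsum(flat_labels[order])

    precision = hits / np.arange(1, len(order) + 1)
    recall = hits / total_pos
    return precision, recall

# Toy example: 3 bags, 3 relations, relation 0 is NA.
scores = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.2, 0.3, 0.5]]
labels = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
precision, recall = pr_curve_non_na(scores, labels)
```

The second bag has only the NA label, so it contributes no positives but its non-NA scores still appear in the ranking as potential false positives.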

ReDsUn · Answer 2 · Tue Jan 28 2020 22:43:21 GMT+0800 (China Standard Time)

OK, thanks a lot.
I want to ask another question: what is the probY in the dataset?
In the README, you said it's the relation alias, but I am confused about that.
Could you please explain it to me?
Thanks a lot.

Best.

ReDsUn · Answer 3 · Wed Jan 29 2020 00:02:40 GMT+0800 (China Standard Time)

In other words, what does the index in 'probY' represent for each sentence?

Shikhar Vashishth · Answer 4 · Wed Jan 29 2020 00:04:12 GMT+0800 (China Standard Time)

Through this pipeline (https://github.com/malllabiisc/RESIDE/blob/master/images/relation_alias.png), we get a probability distribution over relations for each sentence in the dataset, which is used as side information. That's what ProbY means.