THU-KEG / MAVEN-ERE

Source code and dataset for the EMNLP 2022 paper "MAVEN-ERE: A Unified Large-scale Dataset for Event Coreference, Temporal, Causal, and Subevent Relation Extraction".

MATRES Evaluation

xinsu626 opened this issue

commented

Hi, thanks for sharing the data! I had a question regarding the performance evaluation of temporal relation extraction. I noticed that you used the "--ignore_nonetype" flag when evaluating the model on MATRES. If I understand correctly, this means that the model only predicts the relationship of event pairs that must have a temporal relation and only evaluates those pairs. Is this a convention? Or should I disable the ignore_nonetype flag in order to evaluate all event pair relationships?

Hi, thanks for your interest in our work. The --ignore_nonetype flag does not mean discarding the negative samples (event pairs without temporal relations) from evaluation entirely. It excludes the negative samples from the loss calculation (see here), which is a common practice in imbalanced multi-class classification. The intuition is that if negative samples were included in the loss, they would dominate the gradient, since they far outnumber the positive samples.
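For readers following along, here is a minimal sketch of what excluding samples from the loss can look like. It is not the repository's code: the function name `masked_cross_entropy` and the toy logits are hypothetical, and it assumes that excluded pairs carry the sentinel label -100 (the value mentioned later in this thread, and also the default `ignore_index` of PyTorch's `torch.nn.CrossEntropyLoss`).

```python
import math

IGNORE_INDEX = -100  # assumed sentinel label for pairs excluded from the loss


def masked_cross_entropy(logits, labels, ignore_index=IGNORE_INDEX):
    """Mean cross-entropy over the examples whose label is not ignore_index."""
    total, kept = 0.0, 0
    for row, label in zip(logits, labels):
        if label == ignore_index:
            continue  # ignored pairs add nothing to the loss or its gradient
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[label]
        kept += 1
    # Returning 0.0 for an all-ignored batch is a choice made for this sketch.
    return total / kept if kept else 0.0


# Toy batch: three event pairs, three relation classes; the second is ignored.
logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
labels = [0, IGNORE_INDEX, 1]
loss = masked_cross_entropy(logits, labels)
```

Changing the logits of the ignored pair leaves `loss` unchanged, which is the point: those pairs cannot dominate the gradient because they never enter the average.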

commented

@Bakser Thanks for the prompt reply. I got the training part, but it seems you also excluded those -100 labels during the evaluation (https://github.com/THU-KEG/MAVEN-ERE/blob/main/temporal/main_other.py#L48).

I also just found that the evaluation method described in Appendix A of Ning et al. 2019 considers only the event pairs with before, after, and equal relations. So I guess you are following the same evaluation scheme.

That's right. Correctly classifying negative samples does not count toward the evaluation metrics. Sorry for missing that previously.
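As a rough illustration of this evaluation scheme, the sketch below drops every pair whose gold label is -100 before scoring. The function name `micro_f1` and the toy labels are hypothetical, and the actual script in the repository may compute its metrics differently.

```python
IGNORE_INDEX = -100  # assumed sentinel gold label for pairs left out of evaluation


def micro_f1(preds, golds, ignore_index=IGNORE_INDEX):
    """Micro-averaged F1 over the pairs whose gold label is not ignore_index."""
    kept = [(p, g) for p, g in zip(preds, golds) if g != ignore_index]
    tp = sum(1 for p, g in kept if p == g)
    # Each wrong prediction is a false positive for the predicted class
    # and a false negative for the gold class.
    fp = fn = len(kept) - tp
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy example with classes 0, 1, 2; two pairs have no gold relation.
golds = [0, 1, IGNORE_INDEX, 2, IGNORE_INDEX, 1]
preds = [0, 2, 1, 2, 0, 1]
score = micro_f1(preds, golds)  # 3 of the 4 kept pairs are correct
```

Whatever the model predicts for the two ignored pairs has no effect on `score`, so correctly classifying negative samples neither helps nor hurts the metric.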

commented

Got it. Thank you!