Wuyxin / DIR-GNN

Official code of "Discovering Invariant Rationales for Graph Neural Networks" (ICLR 2022)

Home Page: https://arxiv.org/abs/2201.12872


Issues with calculating precision@K

AGTSAAA opened this issue

The way you calculate precision@K may be wrong, since you should divide by K in the last line instead of by num_gd. I also want to ask: what are the meanings of C and E?

num_gd = int(ground_truth_mask[C: C + E].sum())
pred = pred_weight[C:C + E]
_, indices_for_sort = pred.sort(descending=True, dim=-1)
idx = indices_for_sort[:num_gd].detach().cpu().numpy()
precision.append(ground_truth_mask[C: C + E][idx].sum().float()/num_gd)

Thank you very much!

Hi,

C is the start index of a graph's edges and E is the number of edges in that graph. Basically, we are splitting one graph out of a batch to compute the metric, as in the sketch below.
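
For concreteness, here is a minimal sketch of how such (C, E) slices could be derived when the edges of all graphs in a batch are stored contiguously (PyG-style); the tensor names and edge counts are my own illustration, not the repo's code:

```python
import torch

# Hypothetical batch of 3 graphs whose edges are stored back-to-back
# in one flat edge list: 4, 6, and 5 edges respectively.
num_edges_per_graph = torch.tensor([4, 6, 5])

# C for each graph is the offset of its first edge in the flat list;
# E is its edge count.
offsets = torch.cat([torch.zeros(1, dtype=torch.long),
                     num_edges_per_graph.cumsum(0)[:-1]])

for C, E in zip(offsets.tolist(), num_edges_per_graph.tolist()):
    # ground_truth_mask[C: C + E] and pred_weight[C: C + E] would then
    # slice out exactly the edges belonging to this graph.
    print(f"this graph's edges occupy [{C}, {C + E})")
```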

Detailed explanation:

ground_truth_mask[C: C + E] is the ground-truth causal mask.
ground_truth_mask[C: C + E][idx] selects the ground-truth values at the edges predicted by DIR.

hits = sum(ground_truth_mask[C: C + E][idx])
total = sum(ground_truth_mask[C: C + E])

precision = hits / total
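
Putting the pieces together, a self-contained toy version of the metric as described (my own toy tensors; variable names follow the snippet in the question):

```python
import torch

# One graph's slice [C: C + E] of the batch, as toy data.
ground_truth_mask = torch.tensor([0., 0., 1., 1., 1.])  # 1 = causal edge
pred_weight = torch.tensor([0.9, 0.8, 0.7, 0.2, 0.1])   # DIR edge scores

num_gd = int(ground_truth_mask.sum())  # number of ground-truth causal edges
_, indices_for_sort = pred_weight.sort(descending=True, dim=-1)
idx = indices_for_sort[:num_gd].detach().cpu().numpy()

hits = ground_truth_mask[idx].sum()  # selected edges that are truly causal
total = ground_truth_mask.sum()      # all truly causal edges (== num_gd)
print((hits / total).item())         # 0.333...
```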

We don't need to divide by K, since here K = 5, which is the number of edges in a circle motif. But you are right: it is more appropriate to just call it Precision, since crane and house motifs have 6 edges, not exactly 5; alternatively, we could change num_gd to 5. Either way, we keep the evaluation the same for all the baselines, so the conclusion should be the same. I will correct it in the code.

Thanks for pointing that out!

Thank you very much for your reply! I believe what you originally computed is actually Recall, not Precision.

Note that we set the number of causal edges predicted by DIR equal to the number of ground-truth edges, thus FN = FP.

Example:

ground truth mask = [0, 0, 1, 1, 1]

predicted mask = [1, 1, 1, 0, 0]

TP = 1
FN = FP = 2
TN = 0

precision = recall = 1 / 3
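
A quick check of this arithmetic in plain Python (no dependencies):

```python
gt   = [0, 0, 1, 1, 1]  # ground truth mask
pred = [1, 1, 1, 0, 0]  # predicted mask, same number of 1s as gt

TP = sum(1 for p, g in zip(pred, gt) if p and g)      # 1
FP = sum(1 for p, g in zip(pred, gt) if p and not g)  # 2
FN = sum(1 for p, g in zip(pred, gt) if not p and g)  # 2

precision = TP / (TP + FP)  # 1/3
recall    = TP / (TP + FN)  # 1/3
assert precision == recall  # holds whenever #predicted == #ground truth
```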

Yes. Precision = Recall holds if and only if we set the number of causal edges (the top K) predicted by DIR equal to the number of ground-truth edges. But the number of ground-truth edges varies per sample, so you cannot call it Precision@5, since crane and house motifs have 6 edges, not exactly 5. That is, the length of

idx = indices_for_sort[:num_gd].detach().cpu().numpy()

varies, whereas it should be the same $K$ for every prediction list in a Precision@K calculation. The better way is to change num_gd to 5 in both the prediction and ground-truth lists if you want to call it Precision@5.
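
For illustration, a sketch of what that fixed-K variant could look like (my own rewrite under this suggestion, not the repo's code; K = 5 comes from the discussion above):

```python
import torch

def precision_at_k(ground_truth_mask, pred_weight, k=5):
    """Hits among the top-k scored edges, divided by k (not by num_gd)."""
    _, order = pred_weight.sort(descending=True, dim=-1)
    return ground_truth_mask[order[:k]].sum().float() / k

# A house motif has 6 ground-truth edges; k stays 5 regardless.
gt = torch.tensor([1., 1., 1., 1., 1., 1., 0., 0.])
scores = torch.tensor([.9, .8, .7, .6, .5, .4, .3, .2])
print(precision_at_k(gt, scores).item())  # 1.0 -- all top-5 edges are causal
```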

I see; you mentioned two separate problems.

The varying number of ground-truth edges doesn't change the fact that it is the average Precision by definition (as long as #gd = #selected, Recall and Precision coincide, so I don't see that as a problem), but it does affect whether it can be called Precision@K.

In conclusion, the argument is not between Recall and Precision but between Precision and Precision@K. And the correct term is Precision, imo. Does that make sense to you?

Yes. It is just not the Precision@5 reported in your paper.