aangelopoulos / conformal-prediction

Lightweight, useful implementation of conformal prediction on real data.

Home Page: http://people.eecs.berkeley.edu/~angelopoulos/blog/posts/gentle-intro/


Score function for APS

szalouk opened this issue

Hello,

Thank you for providing these notebooks for conformal prediction, they have been immensely helpful.

Reading through Section 2.1 of the paper ("Classification with Adaptive Prediction Sets") and the associated notebook, I had some questions about the score function.

Namely, the paper provides the score function

$$s(x,y) = \sum_{j=1}^k \hat{f}(x)_{\pi_j(x)}$$

where $\pi(x)$ sorts the classes from most to least likely and $k$ is the rank of the true label, i.e. $y = \pi_k(x)$. Why are we including $\hat{f}(x)_{y}$ itself in the sum? Doing so can produce problematic scores. Consider, for example, a perfect predictor that assigns all of its mass to the correct label $y$, and a completely wrong predictor that assigns all of its mass to a single incorrect label $\ne y$. Both predictors receive the same score of 1. This breaks the assumption that a higher score corresponds to greater misalignment between the forecaster and the true label.
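
For concreteness, this is how I read the score from the formula, as a minimal numpy sketch (my own reconstruction with my own names, not the notebook's code, and ignoring any randomization):

```python
import numpy as np

def aps_score(softmax, labels):
    """Score as written in the paper: cumulative sorted probability
    up to and including the true label.
    softmax: (n, K) predicted probabilities, labels: (n,) true classes."""
    order = np.argsort(-softmax, axis=1)                 # classes sorted by descending probability
    sorted_probs = np.take_along_axis(softmax, order, axis=1)
    cumsum = np.cumsum(sorted_probs, axis=1)             # cumulative mass after each sorted class
    rank = np.where(order == labels[:, None])[1]         # position k of the true label in the sort
    return cumsum[np.arange(len(labels)), rank]          # includes f(x)_y itself
```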

Investigating this further, I tried modifying the score function to greedily include all classes up to, but not including, the true label. Intuitively, a higher score would then correspond to more probability mass assigned to incorrect labels, which seems like a better measure of misalignment. When I coded this up in the APS notebook, this small fix increased the coverage slightly and, more importantly, decreased the mean size of the confidence sets to 3.3 (compared to 187.5 in the original notebook). The confidence sets on the ImageNet examples also look more sensible on preliminary inspection. This could possibly address an issue raised previously.
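
Concretely, the modification amounts to dropping $\hat{f}(x)_y$ from the sum, i.e. scoring only the mass ranked strictly above the true label (again just a sketch, continuing the notation above):

```python
def aps_score_excluding_true(softmax, labels):
    """Modified score: total probability of classes ranked strictly
    above the true label, i.e. the sum up to but not including y."""
    order = np.argsort(-softmax, axis=1)
    sorted_probs = np.take_along_axis(softmax, order, axis=1)
    cumsum = np.cumsum(sorted_probs, axis=1)
    rank = np.where(order == labels[:, None])[1]
    true_prob = softmax[np.arange(len(labels)), labels]  # f(x)_y, removed from the running sum
    return cumsum[np.arange(len(labels)), rank] - true_prob
```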

Is there a typo/error in the score function of APS that would explain these results?

Thanks in advance!