Report ESS 0 if khat too large

Question

Report ESS 0 if khat too large

avehtari opened this issue 3 years ago · comments

Example output from https://discourse.mc-stan.org/t/understanding-loo-results-for-non-standard-contaminated-controls-model/25493

Pareto k diagnostic values:
                         Count Pct.    Min. n_eff
(-Inf, 0.5]   (good)     717   73.2%   151       
 (0.5, 0.7]   (ok)       133   13.6%   116       
   (0.7, 1]   (bad)       74    7.6%   38        
   (1, Inf)   (very bad)  56    5.7%   839       
See help('pareto-k-diagnostic') for details.

where the very bad category shows quite high effective sample size. PSIS paper demonstrates that the effective sample size estimate is unreliable when k>0.7. We are already reporting for MCSE NA if any khat>0.7, so I propose we report effective sample size 0 in these categories. This would reduce possible confusion.

Aki Vehtari · Answer 1 · Sat Jan 15 2022 01:08:41 GMT+0800 (China Standard Time)

Just report 0, or also store in loo object $diagnostics$ess value 0?

Seth Axen · Answer 2 · Sat Jan 15 2022 01:57:24 GMT+0800 (China Standard Time)

Why prefer ESS of 0 over NaN?

Måns Magnusson · Answer 3 · Sun Jan 16 2022 00:24:49 GMT+0800 (China Standard Time)

I agree with Seth, maybe it is more defensive to return NaN. Or is there a specific reason to return 0 here?

Aki Vehtari · Answer 4 · Tue Jan 18 2022 03:07:28 GMT+0800 (China Standard Time)

Why NaN instead of NA? Currently if any k>0.7, we report MCSE as NA

I don't have a strong preference for 0, but I would assume 0 would produce fewer questions. But as we report NA for MCSE, I'm with NA, too

Måns Magnusson · Answer 5 · Wed Jan 19 2022 23:23:37 GMT+0800 (China Standard Time)

Oh. I have no strong feelings on NA vs NaN. If NA is better I think that is good, but I think you said that MCSE returned NaN? I just think NA/NaN is better than 0 since it will cast a more explicit error and hence be more defensive. People would not accidentally use the value.

Maybe we can throw a warning when returning NA to avoid questions?

Aki Vehtari · Answer 6 · Wed Jan 19 2022 23:57:08 GMT+0800 (China Standard Time)

but I think you said that MCSE returned NaN?

Where? In the above comment, I write "Currently if any k>0.7, we report MCSE as NA"

Seth Axen · Answer 7 · Wed Jan 19 2022 23:59:13 GMT+0800 (China Standard Time)

NA seems better, since IIUC, it indicates the value is unknown or unavailable.

Aki Vehtari · Answer 8 · Wed Jan 19 2022 23:59:58 GMT+0800 (China Standard Time)

I just think NA/NaN is better than 0 since it will cast a more explicit error and hence be more defensive. People would not accidentally use the value.

NA (or NaN) is not an error. I'm happy if they use ESS 0, when we know the efficiency is very close to 0.

Maybe we can throw a warning when returning NA to avoid questions?

I think we already have too many warnings thrown that are annoying after the first time. I'm fine adding an explanation in the output if there are k>0.7 as we have planned to do anyway.

Måns Magnusson · Answer 9 · Thu Jan 20 2022 00:59:09 GMT+0800 (China Standard Time)

Sorry. I see ” We are already reporting for MCSE NaN if any khat>0.7, so I propose we report…” in the comment above?

I meant that if people would use the n_eff they would explicitly need to set its value to 0 for k>0.7. Hence it would need an active decision to use it. The current recommendations is to not use the estimates of elpd_i in the first place if k>0.7?

I agree, a warning is not good. I think the current documentation is quite clear. Maybe add a sentance there that the NA can be replaced with zero under the paragraph of effective sample size? If we go with a defensive solution.

Although, this is no big thing for me so just decide what you think is best. Its just me that have a very defensive coding style.

ps. When reading I realized that a ref in the docs is incorrect. I guess Vehtari et al (2017b) should be Vehtari et al (2019).

Aki Vehtari · Answer 10 · Thu Jan 20 2022 01:20:27 GMT+0800 (China Standard Time)

Sorry. I see ” We are already reporting for MCSE NaN if any khat>0.7, so I propose we report…” in the comment above?

Oh, in the first post! I fixed that. The actual printed output is NA.

I meant that if people would use the n_eff

btw. we're switching to use ESS instead of n_eff #192

they would explicitly need to set its value to 0 for k>0.7.

Based on the questions about the current NA, many novice users don't know the meaning of NA. For ESS, 0 might be more informative.

The current recommendations is to not use the estimates of elpd_i in the first place if k>0.7?

The current recommendation is not to use it, if someone's life depend on it (or something similar). In early phases of the workflow you may use it with caution. Of course in the output we need to opt for a short message, and then provide the more cautious explanation.

0: Pros: might be easier to interpret in comparison of other ESS values (ie 0 is smaller than any positive value). Pro or cons: if used to estimate MCSE produces Inf, while we're now reporting NA.
NA: Pros: would follow the current use of NA for MCSE, would mean more of "we don't know" than certain 0.

Måns Magnusson · Answer 11 · Thu Jan 20 2022 02:07:17 GMT+0800 (China Standard Time)

I think this is a good summary. I would add that using NA would probably nudge users to read the Pareteto-k-diagnostic docs with higher probability / be more defensive.

I still lean more toward NA, but I have no strong opinion here.

Måns Magnusson · Answer 12 · Thu Jan 20 2022 02:22:07 GMT+0800 (China Standard Time)

I now fixed the refs in PR #194 as well.