stan-dev / loo

loo R package for approximate leave-one-out cross-validation (LOO-CV) and Pareto smoothed importance sampling (PSIS)

Home Page: https://mc-stan.org/loo

E_loo fails with github version of posterior

avehtari opened this issue

The latest loo release uses posterior::pareto_khat() as it exists in the latest posterior CRAN release. The GitHub version of posterior has breaking changes. I can fix this tomorrow, but I'm creating the issue now. Ping @n-kall, too.

All tests for loo v2.6.0 seem to pass fine with the github version of posterior installed. What specifically is failing?

Edit: I meant to check loo v2.7.0, but those tests pass as well.

loo v2.6.0 test output

> devtools::test()
ℹ Testing loo

Attaching package: ‘testthat’

The following object is masked from ‘package:loo’:

    compare

This is loo version 2.6.0
- Online documentation and vignettes at mc-stan.org/loo
- As of v2.0.0 loo defaults to 1 core but we recommend using as many as possible. Use the 'cores' argument or set options(mc.cores = NUM_CORES) for an entire session.
✔ | F W  S  OK | Context
✔ |         20 | helper functions and example data
✔ |         51 | compare models
✔ |         18 | crps
✔ |         72 | Depracted extractors
✔ |         58 | E_loo
✔ |          2 | extract_log_lik
✔ |          5 | generalized pareto
✔ |         31 | kfold helper functions
✔ |         54 | loo, waic and elpd
✔ |         25 | loo_approximate_posterior
✔ |         22 | moment matching
✔ |         51 | loo_predictive_metric
✔ |        161 | loo_subsampling [8.0s]
✔ |         54 | loo_subsampling_approximations [1.1s]
✔ |         34 | loo_subsampling_estimation
✔ |        115 | loo_subsampling cases [43.9s]
✔ |         34 | loo_compare_subsample [2.4s]
✔ |         20 | subsample with tis, sis [17.2s]
✔ |         35 | loo_model_weights
✔ |         60 | print, plot, diagnostics [16.4s]
✔ |         64 | psis_approximate_posterior
✔ |         42 | psis
✔ |         15 | psislw
✔ |          5 | relative_eff methods
✔ |         58 | tis and is

══ Results ══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
Duration: 92.4 s

[ FAIL 0 | WARN 0 | SKIP 0 | PASS 1106 ]
> sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so;  LAPACK version 3.9.0

locale:
 [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C             
 [3] LC_TIME=en_DK.utf8        LC_COLLATE=en_GB.utf8    
 [5] LC_MONETARY=en_GB.utf8    LC_MESSAGES=en_GB.utf8   
 [7] LC_PAPER=fi_FI.utf8       LC_NAME=C                
 [9] LC_ADDRESS=C              LC_TELEPHONE=C           
[11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C      

time zone: Europe/Helsinki
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] loo_2.6.0      testthat_3.2.1

loaded via a namespace (and not attached):
 [1] tensorA_0.36.2.1     utf8_1.2.4           generics_0.1.3      
 [4] stringi_1.8.3        digest_0.6.34        magrittr_2.0.3      
 [7] pkgload_1.3.4        fastmap_1.1.1        rprojroot_2.0.4     
[10] pkgbuild_1.4.3       sessioninfo_1.2.2    brio_1.1.4          
[13] backports_1.4.1      urlchecker_1.0.1     promises_1.2.1      
[16] purrr_1.0.2          fansi_1.0.6          abind_1.4-5         
[19] cli_3.6.2            shiny_1.8.0          rlang_1.1.3         
[22] ellipsis_0.3.2       remotes_2.4.2.1      withr_3.0.0         
[25] cachem_1.0.8         devtools_2.4.5       tools_4.3.2         
[28] parallel_4.3.2       memoise_2.0.1        checkmate_2.3.1     
[31] httpuv_1.6.13        vctrs_0.6.5          posterior_1.5.0.9000
[34] R6_2.5.1             mime_0.12            matrixStats_1.2.0   
[37] lifecycle_1.0.4      stringr_1.5.1        fs_1.6.3            
[40] htmlwidgets_1.6.4    usethis_2.2.3        miniUI_0.1.1.1      
[43] waldo_0.5.2          desc_1.4.3           pkgconfig_2.0.3     
[46] pillar_1.9.0         later_1.3.2          glue_1.7.0          
[49] profvis_0.3.8        Rcpp_1.0.12          tibble_3.2.1        
[52] rstudioapi_0.15.0    xtable_1.8-4         htmltools_0.5.7     
[55] compiler_4.3.2       distributional_0.4.0
> 

loo github version (v2.7.0.9000, but also v2.7.0) test output:

> devtools::test()
ℹ Testing loo
This is loo version 2.7.0.9000
- Online documentation and vignettes at mc-stan.org/loo
- As of v2.0.0 loo defaults to 1 core but we recommend using as many as possible. Use the 'cores' argument or set options(mc.cores = NUM_CORES) for an entire session.
✔ | F W  S  OK | Context
✔ |         20 | helper functions and example data
✔ |         53 | compare models
✔ |         18 | crps
✔ |         72 | Depracted extractors
✔ |         73 | E_loo
✔ |          2 | extract_log_lik
✔ |          5 | generalized pareto
✔ |         31 | kfold helper functions
✔ |         54 | loo, waic and elpd
✔ |         25 | loo_approximate_posterior
✔ |         23 | moment matching
✔ |         51 | loo_predictive_metric
✔ |        161 | loo_subsampling [7.7s]
✔ |         54 | loo_subsampling_approximations [1.2s]
✔ |         34 | loo_subsampling_estimation
✔ |        121 | loo_subsampling cases [24.6s]
✔ |         34 | loo_compare_subsample [2.3s]
✔ |         20 | subsample with tis, sis [1.9s]
✔ |         35 | loo_model_weights
✔ |          9 | pointwise convenience function
✔ |         60 | print, plot, diagnostics [14.7s]
✔ |         64 | psis_approximate_posterior
✔ |         47 | psis
✔ |         15 | psislw
✔ |          5 | relative_eff methods
✔ |         63 | tis and is

══ Results ══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
Duration: 56.7 s

[ FAIL 0 | WARN 0 | SKIP 0 | PASS 1149 ]
> sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so;  LAPACK version 3.9.0

locale:
 [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C             
 [3] LC_TIME=en_DK.utf8        LC_COLLATE=en_GB.utf8    
 [5] LC_MONETARY=en_GB.utf8    LC_MESSAGES=en_GB.utf8   
 [7] LC_PAPER=fi_FI.utf8       LC_NAME=C                
 [9] LC_ADDRESS=C              LC_TELEPHONE=C           
[11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C      

time zone: Europe/Helsinki
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] loo_2.7.0.9000 testthat_3.2.1

loaded via a namespace (and not attached):
 [1] tensorA_0.36.2.1     generics_0.1.3       utf8_1.2.4          
 [4] stringi_1.8.3        digest_0.6.34        magrittr_2.0.3      
 [7] pkgload_1.3.4        fastmap_1.1.1        rprojroot_2.0.4     
[10] pkgbuild_1.4.3       sessioninfo_1.2.2    backports_1.4.1     
[13] brio_1.1.4           urlchecker_1.0.1     promises_1.2.1      
[16] purrr_1.0.2          fansi_1.0.6          abind_1.4-5         
[19] cli_3.6.2            shiny_1.8.0          rlang_1.1.3         
[22] ellipsis_0.3.2       remotes_2.4.2.1      withr_3.0.0         
[25] cachem_1.0.8         devtools_2.4.5       tools_4.3.2         
[28] parallel_4.3.2       memoise_2.0.1        checkmate_2.3.1     
[31] httpuv_1.6.13        posterior_1.5.0.9000 vctrs_0.6.5         
[34] R6_2.5.1             mime_0.12            matrixStats_1.2.0   
[37] lifecycle_1.0.4      stringr_1.5.1        fs_1.6.3            
[40] htmlwidgets_1.6.4    usethis_2.2.3        miniUI_0.1.1.1      
[43] waldo_0.5.2          pkgconfig_2.0.3      desc_1.4.3          
[46] pillar_1.9.0         later_1.3.2          glue_1.7.0          
[49] profvis_0.3.8        Rcpp_1.0.12          tibble_3.2.1        
[52] rstudioapi_0.15.0    xtable_1.8-4         htmltools_0.5.7     
[55] compiler_4.3.2       distributional_0.4.0

I encountered this in the Nabiximols study, but can also reproduce it with:

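library(brms)  # provides brm(), add_criterion(), and the epilepsy data
library(loo)   # provides E_loo()
fit1 <- brm(count ~ zAge + zBase * Trt, data = epilepsy, family = negbinomial())
fit1 <- add_criterion(fit1, criterion = "loo", save_psis = TRUE)
# LOO expectation of the 0/1 indicator that the predicted count exceeds 20
E_loo(0 + (posterior_predict(fit1) > 20), loo(fit1)$psis_object)$value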

One problem is that when pareto_khat(..., tail = "both") sees a constant tail, max(left_k, right_k) gives NA. It would probably be better to use k <- max(left_k, right_k, na.rm = TRUE).
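
To illustrate the NA propagation described here, the following base R sketch uses made-up values for left_k and right_k; it does not call posterior at all. The helper combine_khat() is a hypothetical name introduced only for this example and is not part of posterior or loo. It also covers one caveat of the suggested change: if both tails are NA, max(..., na.rm = TRUE) returns -Inf with a warning, so the sketch assumes NA is the preferred result in that case.

```r
# Hypothetical tail estimates: assume the left tail is constant, so its
# khat is NA, while the right tail has a finite estimate.
left_k <- NA_real_
right_k <- 0.35

# A single NA propagates through max() by default.
k_default <- max(left_k, right_k)

# With na.rm = TRUE the finite estimate is kept.
k_na_rm <- max(left_k, right_k, na.rm = TRUE)

# Hypothetical helper (not part of posterior or loo): behaves like
# max(..., na.rm = TRUE), but returns NA instead of -Inf (with a warning)
# when both tails are NA.
combine_khat <- function(left_k, right_k) {
  ks <- c(left_k, right_k)
  if (all(is.na(ks))) {
    return(NA_real_)
  }
  max(ks, na.rm = TRUE)
}

k_one_constant <- combine_khat(left_k, right_k)
k_both_constant <- combine_khat(NA_real_, NA_real_)
```

Whether NA or some other value is the right result when both tails are constant is a design decision for the package maintainers; the helper only makes that case explicit.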