lawwu / tidyposterior

Bayesian comparisons of models using resampled statistics

Home Page: https://topepo.github.io/tidyposterior

tidyposterior

This package can be used to conduct post hoc analyses of resampling results generated by models.

For example, if two models are evaluated with the root mean squared error (RMSE) using 10-fold cross-validation, there are 10 paired statistics. These can be used to make comparisons between models without involving a test set.
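To make the idea of paired statistics concrete, here is a minimal base R sketch. The RMSE values are simulated, and the names `model_a`, `model_b`, and `fold_effect` are hypothetical; nothing below comes from the package itself.

```r
# Simulated (not real) RMSE values for two hypothetical models over 10 folds
set.seed(42)
folds <- sprintf("Fold%02d", 1:10)
fold_effect <- rnorm(10, sd = 0.3)  # shared difficulty of each fold

rmse_a <- 2.0 + fold_effect + rnorm(10, sd = 0.1)
rmse_b <- 2.2 + fold_effect + rnorm(10, sd = 0.1)

# One row per fold: the two RMSE values in a row form a pair
paired <- data.frame(id = folds, model_a = rmse_a, model_b = rmse_b)
paired$difference <- paired$model_a - paired$model_b
paired

# Because both models see the same folds, the shared fold effect cancels
# in the differences, so they tend to vary less than either column alone
sd(paired$model_a)
sd(paired$difference)
```

Pairing by fold is what lets the comparison use the resamples alone: each difference is computed on the same held-out data for both models.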

There is a rich literature on the analysis of model resampling results, such as McLachlan's Discriminant Analysis and Statistical Pattern Recognition and the references therein. This package follows the spirit of Benavoli et al. (2017).

tidyposterior uses Bayesian generalized linear models for this purpose and can be considered an upgraded version of the caret::resamples function. The package works with rsample objects natively but any results in a data frame can be used.
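If resampling results come from outside rsample, they are often stored in long format. The base R sketch below reshapes them into one row per resample and one column per model, which is the layout of the `accuracy` object in the Example section. The values are simulated, the model names `glm` and `knn` are placeholders, and the assumption that this layout (an `id` column plus one column per model) is what the package expects from a plain data frame should be checked against `?perf_mod`.

```r
# Simulated (not real) long-format results: one row per (resample, model) pair
set.seed(1)
long <- expand.grid(
  id = sprintf("Fold%02d", 1:10),
  model = c("glm", "knn"),
  stringsAsFactors = FALSE
)
long$accuracy <- round(runif(nrow(long), min = 0.7, max = 0.9), 3)

# Reshape to wide format: one row per resample, one column per model
wide <- reshape(long, idvar = "id", timevar = "model", direction = "wide")

# reshape() names the new columns "accuracy.glm", "accuracy.knn";
# strip the prefix so each column is named after its model
names(wide) <- sub("^accuracy\\.", "", names(wide))
wide
```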

Installation

You can install tidyposterior from GitHub with:

# install.packages("devtools")
devtools::install_github("topepo/tidyposterior")

Example

library(tidyposterior)
# See ?precise_example
data(precise_example)

# Get classification accuracy results for analysis

library(dplyr)
accuracy <- precise_example %>%
   select(id, contains("Accuracy")) %>%
   setNames(tolower(gsub("_Accuracy$", "", names(.)))) 
accuracy

# Model the accuracy results
acc_model <- perf_mod(accuracy, seed = 13311, verbose = FALSE)   

# Extract posterior distributions:
accuracy_dists <- tidy(acc_model)

# Credible intervals for accuracy per model
summary(accuracy_dists)
