edgararuiz / rsample

Classes and functions to create and summarize resampling objects

Home page: https://rsample.tidymodels.org

rsample

Overview

The rsample package provides functions to create different types of resamples and corresponding classes for their analysis. The goal is to have a modular set of methods that can be used for:

  • resampling for estimating the sampling distribution of a statistic
  • estimating model performance using a holdout set
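A minimal sketch of both uses, assuming the built-in mtcars data as a stand-in data set. The choice of statistic (the mean of mpg) and the lm() model are illustrative only and are not part of rsample:

```r
library(rsample)

set.seed(123)

# Use 1: approximate the sampling distribution of a statistic.
# The mean of mpg is an arbitrary statistic chosen for illustration.
boots <- bootstraps(mtcars, times = 25)
boot_means <- vapply(
  boots$splits,
  function(split) mean(analysis(split)$mpg),
  numeric(1)
)
sd(boot_means)  # bootstrap estimate of the standard error of the mean

# Use 2: hold out part of the data to estimate model performance.
split <- initial_split(mtcars, prop = 0.75)
train <- training(split)
test  <- testing(split)

# rsample only creates the split; the modeling code below is our own.
fit  <- lm(mpg ~ wt, data = train)
rmse <- sqrt(mean((test$mpg - predict(fit, newdata = test))^2))
rmse
```

In both cases rsample supplies only the resampling objects and the accessors (analysis(), training(), testing()); the statistic and the model come from base R.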

The scope of rsample is to provide the basic building blocks for creating and analyzing resamples of a data set, but this package does not include code for modeling or calculating statistics. The Working with Resample Sets vignette gives a demonstration of how rsample tools can be used when building models.

Note that resampled data sets created by rsample are directly accessible in a resampling object but add little memory overhead. Because the original data is not modified, R does not make an automatic copy of it for each resample.

For example, creating 50 bootstraps of a data set does not create an object that is 50-fold larger in memory:

library(rsample)
library(mlbench)

data(LetterRecognition)
lobstr::obj_size(LetterRecognition)
#> 2,644,640 B

set.seed(35222)
boots <- bootstraps(LetterRecognition, times = 50)
lobstr::obj_size(boots)
#> 6,686,512 B

# Object size per resample
lobstr::obj_size(boots)/nrow(boots)
#> 133,730.2 B

# Fold increase is <<< 50
as.numeric(lobstr::obj_size(boots)/lobstr::obj_size(LetterRecognition))
#> [1] 2.528326

Created on 2020-05-07 by the reprex package (v0.3.0)

The object holding all 50 bootstrap samples uses less than 3 times the memory of the original data set.
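The sketch below shows how an individual resample can be pulled out of a resampling object. It assumes the built-in mtcars data rather than LetterRecognition so that it runs without mlbench, and the variable names are illustrative:

```r
library(rsample)

set.seed(35222)
boots <- bootstraps(mtcars, times = 5)
first_split <- boots$splits[[1]]

# Row indices that define the two partitions of this resample
in_rows  <- as.integer(first_split, data = "analysis")
out_rows <- as.integer(first_split, data = "assessment")

# Data frames are materialized from the original data only when requested
resampled <- analysis(first_split)    # rows drawn with replacement
held_out  <- assessment(first_split)  # rows that were not drawn

nrow(resampled) == nrow(mtcars)
length(intersect(in_rows, out_rows)) == 0
```

Working with the row indices instead of the extracted data frames is one way to keep memory use low when looping over many resamples.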

Installation

To install the released version of rsample from CRAN, use:

install.packages("rsample")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_dev("rsample")

Contributing

This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

  • For questions and discussions about tidymodels packages, modeling, and machine learning, please post on RStudio Community.

  • If you think you have encountered a bug, please submit an issue.

  • Either way, learn how to create and share a reprex (a minimal, reproducible example) to clearly communicate about your code.

  • We welcome contributions, including typo corrections, bug fixes, and feature requests! If you have never made a pull request to an R package before, rsample is an excellent place to start. Find an issue with the help wanted ❤️ tag, comment that you’d like to take it on, and we’ll help you get started.

  • Check out further details on contributing guidelines for tidymodels packages and how to get help.
