tidyverse / multidplyr

A dplyr backend that partitions a data frame over multiple processes

Home Page: https://multidplyr.tidyverse.org

Implementation of `slice_sample()` for bootstrapping?

raka-everton opened this issue

I was wondering if `slice_sample()` from the tidyverse could be implemented in multidplyr, please?

I have to bootstrap a very large dataset (~20 million observations) per group, and it would be great to be able to do that with multidplyr! Otherwise R just keeps crashing when I try to do it in one big batch.

Thanks for the suggestion! Will definitely consider it when I'm next working on multidplyr.
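
For readers with the same problem, here is a minimal base R sketch of per-group bootstrapping, meant only as an illustration of the idea and not as anything multidplyr provides. It resamples row indices with replacement inside each group, roughly what `group_by(g) %>% slice_sample(prop = 1, replace = TRUE)` does in plain dplyr. The helper name `bootstrap_by_group` and the toy columns `g` and `x` are hypothetical, and the sketch is not tuned for a 20-million-row table.

```r
# Hypothetical helper (not part of dplyr or multidplyr): resample rows with
# replacement within each group, keeping each group's original size.
bootstrap_by_group <- function(df, group_col) {
  # Split row indices (not the data itself) by group.
  idx_by_group <- split(seq_len(nrow(df)), df[[group_col]])

  # For each group, draw positions with replacement and take those rows.
  # sample.int() avoids the surprise sample() has with length-1 vectors.
  resampled <- lapply(idx_by_group, function(idx) {
    picked <- idx[sample.int(length(idx), replace = TRUE)]
    df[picked, , drop = FALSE]
  })

  out <- do.call(rbind, resampled)
  rownames(out) <- NULL
  out
}

# Toy data standing in for the real dataset.
set.seed(1)
toy <- data.frame(g = rep(c("a", "b", "c"), times = c(3, 5, 2)), x = 1:10)
boot <- bootstrap_by_group(toy, "g")
```

Because each group is handled independently inside the `lapply()` call, that loop is the natural place to distribute work across processes, or to compute a summary statistic per group instead of materializing every resampled row.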