The R package bigstatsr provides functions for fast statistical analysis of large-scale data encoded as matrices. The package can handle matrices that are too large to fit in memory by memory-mapping them to binary files on disk. This format is very similar to the big.matrix format provided by the R package bigmemory, which this package no longer uses.
Introduction to package bigstatsr
Note that most of the algorithms of this package don't handle missing values.
```r
# For the CRAN version
install.packages("bigstatsr")
# For the current development version
devtools::install_github("privefl/bigstatsr")
# For the first version (depending on package bigmemory)
devtools::install_github("privefl/bigstatsr", ref = "v-bigmemory")
```
As inputs, package bigstatsr uses Filebacked Big Matrices (FBM).
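
As a rough illustration of what working with an FBM might look like, here is a minimal sketch. It assumes the package is installed and uses a small 10 x 5 matrix with the default temporary backing file; the dimensions and values are made up for the example and are not taken from this README.

```r
library(bigstatsr)

# Create a small 10 x 5 FBM of doubles, initialised to 0.
# By default the data is stored in a temporary backing file on disk.
X <- FBM(10, 5, init = 0)

# Matrix-style indexing reads from and writes to the backing file.
X[, 1] <- 1:10
first_rows <- X[1:3, 1]

# The object itself is only a lightweight handle to the file.
dims <- dim(X)
has_backing_file <- file.exists(X$backingfile)

# Column sums and variances, computed without copying the whole matrix.
stats <- big_colstats(X)
```

Here `X[1:3, 1]` returns an ordinary R vector, while `X` keeps pointing at the file on disk. Because most algorithms in the package do not handle missing values, it is worth checking for `NA`s before filling an FBM with real data.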
Please open an issue if you find a bug. If you want help using bigstatsr, please post on Stack Overflow with the tag bigstatsr (not yet created). For guidance on writing your question, see "How to make a great R reproducible example?".
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.