RBigData / pbdIO

An interface to parallel input/output packages, with a focus on the Single Program/Multiple Data ('SPMD') parallel programming style, intended for batch parallel execution.

pbdIO

An interface to parallel input/output packages, aimed at cluster computers and intended for batch, MPI-based parallel execution. Reading csv files is implemented; ADIOS2 (see "RBigData/hola") and HDF5 support are planned.

csv files: MPI-distributed parallel reading of multiple csv files in a directory. Each rank reads a different subset of the files and combines them into a local data.table/data.frame using data.table's fread function. Further global computations on the distributed data can then be done with functions from the pbdMPI package.
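The csv reading pattern described above can be illustrated with a minimal sketch. It calls pbdMPI and data.table directly rather than pbdIO's own interface, so the round-robin file assignment, the helper `files_for_rank`, and the directory name `data` are all assumptions made for illustration; the package may distribute files differently.

```r
# Sketch only: round-robin assignment of csv files to MPI ranks.
# Uses pbdMPI and data.table directly; it does not call pbdIO itself.
suppressMessages(library(pbdMPI))
library(data.table)

# Hypothetical helper: the files that a 0-based rank out of `size` ranks reads.
files_for_rank <- function(files, rank, size) {
  files[(seq_along(files) - 1L) %% size == rank]
}

init()

csv_dir <- "data"  # hypothetical directory of csv files with identical columns
files <- sort(list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE))

# Each rank reads only its own files into one local data.table.
my_files <- files_for_rank(files, comm.rank(), comm.size())
local_dt <- rbindlist(lapply(my_files, fread))

# Global computation on the distributed data: total row count over all ranks.
total_rows <- allreduce(nrow(local_dt), op = "sum")
comm.print(total_rows)

finalize()
```

A script like this would typically be launched in batch through the MPI launcher, for example `mpirun -np 4 Rscript read_csv.R` (the script name is hypothetical, and the launcher command depends on the MPI installation).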

ADIOS2 files: coming soon (see also the RBigData hola package)

HDF5 files: coming soon

Installation

The package is maintained on GitHub and can be installed with any package that supports installation from GitHub:

### Pick your preference
devtools::install_github("RBigData/pbdIO")
ghit::install_github("RBigData/pbdIO")
remotes::install_github("RBigData/pbdIO")

License: Mozilla Public License 2.0


Languages

R 53.5%, TeX 36.1%, Shell 10.4%