ljdursi / beyond-single-core-R

Short tour of parallel and foreach packages, and how to think about scaling data analyses

Home Page: https://ljdursi.github.io/beyond-single-core-R

Repository: https://github.com/ljdursi/beyond-single-core-R

Beyond Single Core: Parallel Analysis in R

R is a great environment for interactive analysis on your desktop, but when your data needs outgrow your personal computer, it's not clear what to do next.

This is material for a short overview of scalable data analysis in R. The slides can be viewed at https://ljdursi.github.io/beyond-single-core-R .

It covers:

  • How to think about parallelism and scalability in data analysis
  • The standard parallel package, including what were formerly the snow and multicore packages, using airline data as an example
  • The foreach package, using airline data and simple stock data
  • A summary of best practices.
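To give a flavour of the two styles bundled in the parallel package, here is a minimal sketch. It is not taken from the talk: the `delays` list is made-up stand-in data (not the airline data set), and the worker count of 2 is an arbitrary assumption. It compares a serial `lapply` with the snow-style `parLapply` (explicit cluster of worker processes) and the multicore-style `mclapply` (forked workers).

```r
library(parallel)

# Hypothetical stand-in data: simulated delays grouped by a made-up carrier code.
set.seed(42)
delays <- split(rnorm(3000, mean = 10, sd = 5),
                rep(c("AA", "UA", "DL"), each = 1000))

# Serial baseline: one mean per carrier.
serial_means <- lapply(delays, mean)

# snow-style: start an explicit cluster of worker processes, farm out the
# list elements, then shut the cluster down. Works on all platforms.
cl <- makeCluster(2)
snow_means <- parLapply(cl, delays, mean)
stopCluster(cl)

# multicore-style: fork the current R process. Forking is unavailable on
# Windows, where mc.cores must be 1.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
fork_means <- mclapply(delays, mean, mc.cores = cores)

print(unlist(serial_means))
```

All three calls should return the same named list; for a task this small the parallel versions will likely be slower than the serial one, since starting workers and shipping data has a cost.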

Included in the materials, though not in the talk, are some more advanced methods:

  • The bigmemory package for out-of-core computation on large data matrices, with a simple physical sciences example;
  • The Rdsm package for shared memory; and
  • A brief introduction to the powerful pbdR packages for extremely large-scale computation.

About

License: Other


Languages: R 50.3%, CSS 34.4%, Shell 10.6%, Makefile 4.7%