wahani / modules

Modules in R

Home Page: https://cran.r-project.org/package=modules

Attaching base packages to a module

wahani opened this issue

Thanks! Generally speaking, I wonder if it'd be useful to have a separate command, like modules::import_defaults(), that uses the modules-centric way to import all of the standard default packages (perhaps with the exception of datasets) at the start of a module (whether inline or as a script file).

Pros:

  • Not requiring users to try to figure out which 'standard' command is in which 'standard' package. For example, is mean() in base? Is it in utils? Is it in stats (because the mean, after all, is a type of summary statistic)? (Spoiler, it's in base.) What about runif()? Sampling a random value uniformly between [0,1] is so basic to programming languages, it might be in base, right? Nope, it's in stats. Well, then it makes sense that sample() is probably in stats, too, right? Nope, it's in base. 🤦
  • For any module that eventually tries to plot() something, requiring the explicit import of both graphics and grDevices is likely to trip up lots of users.
  • ^ Likewise with needing to import methods if those users want to try some S4 OO programming in their module. (This'll bring traumatizing flashbacks to old-timers who remember the days of needing to library(methods) at the start of every script for R < 3.0 :-)
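
To make the "which package is it in?" guessing game concrete, here is a minimal sketch that looks a function name up among the exports of a few default packages. The helper name home_package and the hard-coded package list are made up for illustration; only base R is used.

```r
# Hypothetical helper: report which of the given packages export a name.
home_package <- function(fn_name,
                         pkgs = c("base", "stats", "utils",
                                  "graphics", "grDevices", "methods")) {
  # getNamespaceExports() works without attaching the package.
  hits <- vapply(pkgs,
                 function(p) fn_name %in% getNamespaceExports(p),
                 logical(1))
  unname(pkgs[hits])
}

home_package("mean")    # "base"
home_package("runif")   # "stats"
home_package("sample")  # "base"
```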

Cons:

  • It attaches lots to the module's namespace. (Though I'm not sure if there's really any extra memory usage, since these are all loaded into memory already when starting R, unless one's overridden the options()$defaultPackages, in which case I'd argue such power-users already know what they're doing.)
  • If the default list of packages ever changes in a future R release, this can cause some mild confusion. Though I'd argue that if the default list ever changes, confusion will happen no matter what, whether a modules user or not.
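
As a rough check of the "already loaded" point, the sketch below lists the session's default packages and asks whether each namespace is currently loaded. It does not measure memory, and the result depends on how the session was started (for example, whether defaultPackages was overridden), so no particular output is assumed.

```r
# Default packages for this session, as configured via options().
defaults <- getOption("defaultPackages")

# TRUE for every default package whose namespace is already loaded.
loaded <- vapply(defaults, isNamespaceLoaded, logical(1))
print(loaded)
```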

Also, I'm just now reading #14, and in the stats::lag/dplyr::lag example, wouldn't ordered imports in modules still provide masking? E.g.

modules::import("stats")
modules::import("dplyr")
lag  ## dplyr's lag, not stats's lag.

If someone really wanted to use stats's lag(), they could bypass what they've imported and just refer to it explicitly, with stats::lag(), no?
This does introduce the complexity of masking within the modules paradigm, and of how to refer to one object versus the other within imported modules (e.g. user-defined functions), since the :: notation won't work there. So, file that problem under the "Con" department, I guess :-/
Though, this is only a problem if people aren't assigning modules to variables, in which case they're inviting their own confusion; so I'd still consider 'attaching' the default packages' contents and then emphasizing that users should import modules with explicit variable assignment, a la foo <- modules::import("foo").
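
The masking idea can be illustrated with plain environments. This is only an analogy in base R, not a description of how modules resolves names internally: first and second are stand-ins for two successive imports, and the second lag is a dummy function standing in for dplyr's.

```r
# Stand-in for the first import: exposes stats's lag().
first <- list2env(list(lag = stats::lag), parent = baseenv())

# Stand-in for a later import that also defines lag(); it sits closer
# to the code that does the lookup, so it is found first.
second <- list2env(
  list(lag = function(x, n = 1L) "stand-in for dplyr::lag"),
  parent = first
)

scope <- new.env(parent = second)

found <- get("lag", envir = scope)
identical(found, stats::lag)                     # FALSE: the later one masks
identical(evalq(stats::lag, scope), stats::lag)  # TRUE: :: bypasses masking
```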

Originally posted by @mmuurr in #13 (comment)

See my general notes on this topic in #13 (comment)

We may add a convenience wrapper around all base/default packages and call import sequentially. That would be trivial to implement. Not sure if there is more to it though.

The current workaround I've come up with is that I've created a 'base' script module that is effectively boilerplate to be used at the start of any other script module.
Here's the base module, modbase.R:

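r_defaults <- function(except = c("datasets"), where = parent.frame()) {
  # Start from R's default packages and drop the ones the caller excludes.
  pkgs <- setdiff(getOption("defaultPackages"), except)
  # Import methods first, if it is in the list.
  if ("methods" %in% pkgs) pkgs <- unique(c("methods", pkgs))
  # Call modules::import(<pkg>) from the caller's environment for each package.
  for (pkg in pkgs) do.call(modules::import, list(str2lang(pkg)), envir = where)
}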

modules::export(r_defaults)

As you can see, it does very little other than expose a function that handles calling modules::import on R's default packages (minus those listed in except), and it specifically calls modules::import from where (i.e. the function's caller).
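
The where = parent.frame() default is what makes the side effect land in the caller rather than inside the helper. A minimal base-R sketch of that pattern, with made-up names (define_in_caller, caller) and assign() standing in for the import:

```r
# Hypothetical helper: the default for `where` is evaluated inside the
# function, so parent.frame() resolves to whoever called it.
define_in_caller <- function(name, value, where = parent.frame()) {
  assign(name, value, envir = where)
  invisible(where)
}

caller <- function() {
  define_in_caller("x", 42)  # creates x in caller()'s own frame
  x
}

result <- caller()
result  # 42
```

Passing where explicitly lets the same helper target any other environment.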

One uses this in another script module like so; let mod-a.R be my module:

modules::use("modbase.R")$r_defaults()
f <- function() runif(10)
modules::export(f)

Now, in a new R session, if one loads "mod-a.R", they'll get access to f which has access to package stats, so runif() will resolve correctly:

a <- modules::use("mod-a.R")
a$f()  ## returns 10 random values

Any other modules can simply include modules::use("modbase.R")$r_defaults() at the top and they'll have access to the standard R library (without having to remember which standard function is in which standard package).

Thank you for the input. We may add importDefaultPackages to the package. Why would you want to exclude something, e.g. datasets?

For compact module use, I figured datasets isn't all that helpful to have on the search path, that's all. And having decided to drop one package, I figured, why not make it general enough to drop any of the default packages. No better reason 🤷‍♂️

@mmuurr I managed to prepare a PR for this. You can install this version with

devtools::install_github("wahani/modules", ref = "import-default-packages")