mschubert / ebits

R bioinformatics toolkit incubator

override of list (with a python-like list)

barzine opened this issue

The following function was written by Gabor Grothendieck,
who graciously posted it on the r-help list in June 2004:
https://stat.ethz.ch/pipermail/r-help/2004-June/053343.html
It lets you assign the elements of a result to several variables at once.

He originally named the function list, thereby overriding the built-in function.
I have renamed it to avoid confusion.

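# Dummy object; it only exists so that `[<-` dispatches on class "result".
pyList <- structure(NA, class = "result")
# Usage: pyList[a, b] <- list(1, 2) assigns a = 1 and b = 2 in the caller.
"[<-.result" <- function(x, ..., value) {
    args <- as.list(match.call())
    # Drop the function name, `x` and `value`; keep only the target names.
    args <- args[-c(1:2, length(args))]
    length(value) <- length(args)
    for (i in seq_along(args)) {
        a <- args[[i]]
        # An empty slot (e.g. pyList[a, ] <- ...) skips that element.
        if (!missing(a)) eval.parent(substitute(a <- v, list(a = a, v = value[[i]])))
    }
    x
}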

Another (maybe more general) option would be to define an operator that imitates Python's ** by taking the elements of a list and assigning them in the parent scope.

If this works in function calls as well, it could also be more intuitive than do.call(...), especially when lists have to be concatenated to construct the call.

Let’s do this systematically. I’ll use Python for comparison here but the same works similarly in other interpreted languages.

  1. * and ** args in Python correspond to ... in R:

    def f(*args): return args
    def g(**kwargs): return kwargs
    
    f(1, 2, 3) # => (1, 2, 3)
    g(a = 1, b = 2) # => {'a': 1, 'b': 2}

  2. * and ** argument passing corresponds to do.call in R:

    def h(a, b, c): return a, b, c
    
    h(*[1, 2, 3]) # => (1, 2, 3)
    h(**{'b': 2, 'a': 1, 'c': 3}) # => (1, 2, 3)

    In R, this would simply be do.call(h, list(a = 1, b = 2, c = 3)), for instance.

  3. Assignment unpacking is what we’re talking about here.

    a, b, c = h(1, 2, 3)

    This is the only operation that doesn’t have an equivalent in R, as I see it.
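The first two correspondences can be sketched in R as follows. This is only a minimal illustration: f and h mirror the Python functions of the same name above, and the variable names are made up for the example.

```r
# `...` collects arbitrary arguments, like *args and **kwargs in Python.
f = function (...) list(...)

positional = f(1, 2, 3)      # unnamed list of three elements
keywords = f(a = 1, b = 2)   # named list with elements a and b

# do.call spreads a list over a function's parameters, like h(*x) or h(**x).
h = function (a, b, c) c(a, b, c)

by_position = do.call(h, list(1, 2, 3))
by_name = do.call(h, list(b = 2, a = 1, c = 3))

# Both calls bind a = 1, b = 2, c = 3, so the results are the same.
identical(by_position, by_name)
```

Point 3 has no counterpart in this sketch: in base R the result of h has to be stored first and its elements assigned one by one.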

@mschubert, I’m not sure what you’ve had in mind specifically for do.call (although I do agree that it is somewhat annoying to use), and how to improve it via a Python-like splat operator.

That said, I’d like a do.call replacement which worked with non-list arguments. It’s also possible to envision an operator syntax (e.g. f %()% args).
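As a rough sketch of what such an operator could look like: the definition below is an assumption made for illustration, not code from ebits. It simply coerces the right-hand side to a list, so that plain vectors work as well, and hands it to do.call.

```r
# Hypothetical splat operator: call `f` with the elements of `args`.
# `as.list` lets non-list arguments (e.g. atomic vectors) pass through.
`%()%` = function (f, args) do.call(f, as.list(args))

total = sum %()% (1 : 3)                       # same as sum(1L, 2L, 3L)
label = paste %()% c('a', 'b', sep = '-')      # same as paste('a', 'b', sep = '-')
```

Names on the right-hand side are matched to parameters of the same name, as with do.call itself; unnamed elements are passed positionally.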

I tried playing around with this, but modules complains that the structure is locked.

Any way around this?

We should coordinate better (that’s why I assigned this to myself …). 😉

I’ve pushed the branch with my experiments. But yes, this doesn’t actually work with modules, because objects are locked (for good reason!) and R insists on performing a useless assignment in this code.

Incidentally, the code still works, despite the error message. But to use it properly, one of two things has to be done:

  • Client-side solution (already works with the current code):

    call = import('ebits/base/call')
    unpack = call$unpack
    unpack[x, y] = 1 : 2

    However, this isn’t perfect since it requires client-side setup (the second line).

  • Use an environment inside the module (requires a trivial code change):

    call = import('ebits/base/call')
    call$x$unpack[x, y] = 1 : 2

    Here, x is defined inside the module as x = list2env(list(unpack = unpack)). This means that, even though x itself is locked when it is exported from the module, its contents aren’t. This is the established way in R for packages and modules to be stateful. The obvious disadvantage is that the API makes no sense whatsoever from the user’s perspective. Vetoed.
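The mechanism behind the second option can be illustrated with base R alone. In this sketch a plain locked environment stands in for a module; the names module, state and counter are invented for the example and the real modules implementation may differ.

```r
# Stand-in for a module: an environment whose bindings get locked on export.
module = new.env()
module$state = list2env(list(counter = 0))
lockEnvironment(module, bindings = TRUE)

# Rebinding `state` itself fails, because the binding is locked.
rebind_failed = tryCatch({
    assign('state', new.env(), envir = module)
    FALSE
}, error = function (e) TRUE)

# The environment that `state` refers to is not locked, so its contents
# can still change. Environments have reference semantics, which means
# the change is visible through the module.
state = module$state
state$counter = state$counter + 1
module$state$counter
```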

A third alternative would be to modify “modules” to allow objects to be modified. However, I consider this a very bad idea in general, and I cannot think of any other use-case.

How about using option 2, but naming the environment call instead of x and defining it in base rather than in a call submodule (basically, making call a list instead of a submodule)? Then you could write:

b = import('ebits/base')
b$call$unpack[x, y] = 1 : 2

You could also add other functions to the call list (they would be modifiable as well; not sure how much that matters). Granted, that's a bit hacky, but the API is clean.

Alternatively, I could picture a modifiable() wrapper for modules that doesn't lock a given function or object - but that would make more sense if there are a couple of use cases, as you say as well.

Good approach. However, how do we document this? Modules will never search nested environments for documentation.

The only easy option would be to document call and hope that users figure out the rest. That is not ideal, but workable for individual cases (until one or two other use cases for modifiable objects in modules come along).

Coming back to this: wouldn't it be much easier to implement it as a function?

I still see value in this, e.g. combining with io$load() and exporting selected objects.

`unpack<-` = function(...) {
    # take rhs and export to `parent.frame()`
    ...
}

unpack(a, c=b) = some_fun_returning_a_and_b()

@mschubert Try it — this won’t work in the general case unless a has been declared beforehand, due to how R implements assignment.
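The reason is that R rewrites a replacement call f(x) = value into roughly x = `f<-`(x, value = value), so x is looked up before the function runs and is overwritten with the function's return value afterwards. The sketch below shows this with a deliberately simple, hypothetical replacement function `first<-`; it is not the unpack implementation discussed here.

```r
# Hypothetical replacement function: returns the first element of `value`,
# which R then assigns to the target variable.
`first<-` = function (x, value) value[[1L]]

# The target must already exist: R reads it before calling `first<-`.
failed = tryCatch({
    first(undeclared_target) = list(1, 2)
    FALSE
}, error = function (e) TRUE)

# After declaring the target beforehand, the same call works.
a = NULL
first(a) = list(1, 2)
```

Any other names passed inside the call, such as b in unpack(a, c=b), are only arguments to the function, so it has to assign them itself in the calling frame.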

For future reference, a more ergonomic unpack syntax (that does not pollute the calling environment with bogus objects!) is described in my package ‘unpack’. Note that this is a proof-of-concept implementation and is not intended for production use. In particular, it does not support nested structures or named (“destructuring”) unpacking.