shikokuchuo / mirai

mirai - Minimalist Async Evaluation Framework for R

Home Page:https://shikokuchuo.net/mirai/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

WISH: Support passing name-value pairs via '.args' of mirai()

HenrikBengtsson opened this issue · comments

Picking up on the side-discussion on argument .args that started in #49 (comment).

As it stands now (mirai 0.8.2.9036), the .args argument of mirai() takes a vector of symbols as input. The following are all working examples:

m <- mirai::mirai(2 * a + b, .args = list(a, b))
m$data
## [1] 10
m <- mirai::mirai(2 * a + b, .args = c(a, b))
m$data
## [1] 10
m <- mirai::mirai(2 * a + b, .args = data.frame(a, b))
m$data
## [1] 10

Issues

Too broad acceptance

?mirai::mirai says that .args takes an "(optional) list of objects". The above .args = c(a, b) example does not use a list. One can also debate whether .args = data.frame(a, b) should work (although technically it's also a list).

No validation of '.args'

There is no up-front validation of what is passed to .args. Instead, mis-specification of .args results in run-time errors from worker failing to evaluate the expression. For example,

m <- mirai::mirai(2 * a + b, .args = c("a", "b"))
m$data
## 'miraiError' chr Error in 2 * a: non-numeric argument to binary operator
m <- mirai::mirai(2 * a + b, .args = alist(a, b))
m$data
## 'miraiError' chr Error in 2 * a: non-numeric argument to binary operator
m <- mirai::mirai(2 * a + b, .args = 42)
m$data
'miraiError' chr Error in eval(expr = envir[[".expr"]], envir = envir, enclos = NULL): object 'a' not found

No support for name-value pairs

I think there will be a lot of users that will expect being able to do:

m <- mirai::mirai(2 * a + b, .args = list(a = 3, b = 4))
m$data
'miraiError' chr Error in eval(expr = envir[[".expr"]], envir = envir, enclos = NULL): object 'a' not found

This will allow users to write more clear code. Currently, anyone who wishes to pass a named list of values has to resort to do.call(), which counter to the "neat API" that mirai tries to support. For example, as it stands now, we have to write code like:

expr <- quote(2 * a + b)
globals <- list(a = 3, b = 4)
m <- do.call(mirai, c(list(expr), globals))

instead of a much cleaner and less obscure:

expr <- quote(2 * a + b)
globals <- list(a = 3, b = 4)
m <- mirai(expr, .args = globals)

Suggestion

  1. Redesign current behavior to support only the .args = c(a, b) form.

  2. Add support for specifying name-value elements, i.e. .args = list(a = 3, b = 4).

I think those will be the most common use cases. Using c() for one and list() for the other could help distinguish the two. Of course, both forms could be handled using list() too.

FWIW, the fact that c(a, b) uses symbols has the advantage that static-code inspection (e.g. codetools validation by R CMD check) can pick up typos. In contrast, .args = c("a", "btypo") won't be detected until run-time.

There is no up-front validation of what is passed to .args.

Yes, I guess something is merited here!

This will allow users to write more clear code. Currently, anyone who wishes to pass a named list of values has to resort to do.call(), which counter to the "neat API" that mirai tries to support.

I like this suggestion.

FWIW, the fact that c(a, b) uses symbols has the advantage that static-code inspection (e.g. codetools validation by R CMD check) can pick up typos. In contrast, .args = c("a", "btypo") won't be detected until run-time.

Yes, that is precisely one of my objectives. Also why I construct the environment upfront to detect non-named arguments in ... instead of just passing a list to the daemon process.

Thanks for the insight - much appreciated. I will think on the implementation.

Copying @wlandau. Might just make it in time for mirai 0.8.3.

This is implemented in mirai 0.8.2.9037. Completely non-breaking and very lightweight.

mirai simply reads `.args' as name=value pairs if the list is named. I have chosen to modify the existing argument as it can't break existing code as your 2-argument approach would.

There is validation by is.list(), so passing a single number or c(x, y) no longer works. It doesn't prevent people passing in a data.frame as this is a list anyway, so why not? It also doesn't prevent intentional abuse by passing in alist(), why?!

Your excellent example now works:

expr <- quote(2 * a + b)
globals <- list(a = 3, b = 4)
m <- mirai(expr, .args = globals)

@wlandau FYI this should allow you to simplify your use of do.call() in crew.

Thanks. Not at a computer for a while, but what happens now if you do .args = list(a, b = 4)?

The list is named so it is read as a name = value pair, and it errors when the environment is constructed. The same as it would if you passed a, b = 4 as ....

It is documented. It is strictly an either / or.

So, this works:

$ R --quiet --vanilla
ee <- quote(1+2); m <- mirai::mirai(ee)
m$data
## [1] 3

but not this:

$ R --quiet --vanilla
m <- local({ ee <- quote(1+2); mirai::mirai(ee) })
m$data
## 'miraiError' chr Error in eval(expr = envir[[".expr"]], envir = envir, enclos = NULL): object 'ee' not found

It works if you rename ee to expr or .expr.

FWIW, at least for me, too much NSE:ing tend to bring me down these type of rabbit holes.

EDIT: After submitting this comment, I realize it belongs to #49, but since I've already posted it, #49 is closed, and the updates you've done here might have introduced this, I keep it here.

Thanks so much Henrik! This does indeed belong in #49. The get0() call was simply looking in the wrong scope (hence also the reason it was spuriously picking up expr and .expr). This is now corrected in mirai 0.8.2.9038.

I think it's worth the effort getting these things right - and the process has been shortcut with an expert eye such as yours.

Thank you. Works like a charm now.