tidyverse / magrittr

Improve the readability of R code with the pipe

Home Page: https://magrittr.tidyverse.org

Regression: Pipes interaction working on v1.5 but not on development version

iago-pssjd opened this issue

After reading https://www.tidyverse.org/blog/2020/08/magrittr-2-0/ I decided to test the new magrittr version. I installed it from GitHub yesterday and found what seems to be a regression.

The following code (originally from SO) works on magrittr 1.5 (I even tried it on https://rstudio.cloud):

data <- tibble::tribble(
   ~last_name, ~first_name, ~pitcher, ~ff_avg_spin, ~si_avg_spin, ~fc_avg_spin, ~sl_avg_spin, ~ch_avg_spin, ~cu_avg_spin, ~fs_avg_spin,
      "Bauer",    "Trevor",   545333,         2286,         2276,         2539,         2687,         1441,         2464,           NA,
      "Rodon",    "Carlos",   607074,         2148,         2211,         2153,         2465,         1725,         2457,         2630,
  "Verlander",    "Justin",   434378,         2583,           NA,         2595,         2626,         1870,         2796,           NA
  )
library(dplyr)
library(tidyr)
library(magrittr)
data %T>% 
  {nspec <<- build_longer_spec(.,
    cols = contains("spin"), 
    names_to = "pitch_type",
    values_to = "avg_spin") %>%
    mutate(pitch_type = sub("_spin", "", pitch_type))} %>%
  pivot_longer_spec(spec = nspec, values_drop_na = TRUE)

# A tibble: 18 x 5
   last_name first_name pitcher pitch_type avg_spin
   <chr>     <chr>        <dbl> <chr>         <dbl>
 1 Bauer     Trevor      545333 ff_avg         2286
 2 Bauer     Trevor      545333 si_avg         2276
 3 Bauer     Trevor      545333 fc_avg         2539
 4 Bauer     Trevor      545333 sl_avg         2687
 5 Bauer     Trevor      545333 ch_avg         1441
 6 Bauer     Trevor      545333 cu_avg         2464
 7 Rodon     Carlos      607074 ff_avg         2148
 8 Rodon     Carlos      607074 si_avg         2211

But it does not work with the current development version:

Error in is.data.frame(spec) : object 'nspec' not found

I often use this kind of code, and I find it very useful: it lets me adjust the pivot_wider/pivot_longer specs, or create other auxiliary variables (in place of nspec), without breaking the pipe workflow.

Thank you!

This is a consequence of lazy evaluation: pivot_longer_spec() checks its other inputs before forcing the main argument, so spec = nspec is looked up before the piped input, and with it the <<- assignment, has been evaluated. For the same reason, the following pattern is unreliable in R, because there is no guarantee that foo() will evaluate its first argument first:

foo(x <- 1, x)

Note that the pattern you're using will not work with the proposed base R pipe either, since it will also be lazy.
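
To make the evaluation-order point concrete, here is a minimal base R sketch. The functions first_first() and second_first() are hypothetical and exist only for illustration; they stand in for any function that happens to touch its arguments in one order or the other.

```r
# Hypothetical function that evaluates its arguments in order.
first_first <- function(a, b) {
  force(a)  # runs `x <- 1` in the caller's environment
  b         # `x` now exists, so this lookup succeeds
}

# Hypothetical function that touches its second argument first,
# much like a function that validates `spec` before using `data`.
second_first <- function(a, b) {
  force(b)  # `x` has not been assigned yet
  force(a)
  b
}

# Start from a clean slate so `x` is not left over from earlier code.
if (exists("x", inherits = FALSE)) rm(x)

ok <- first_first(x <- 1, x)
rm(x)

# The same call pattern fails when the callee evaluates `b` first.
res <- tryCatch(
  second_first(x <- 1, x),
  error = function(e) conditionMessage(e)
)
print(ok)
print(res)
```

Both calls look identical at the call site; whether they work depends entirely on the internals of the function being called.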

What is your motivation for using this pattern? Is it just to spare an intermediate value? My feeling is that the pipe should not be overdone; using <- here and there with good names really helps make code readable. Using <<- inside { inside %>% is probably not good practice.
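
A minimal sketch of that suggestion, with a named intermediate value instead of <<- inside the pipe. The small pitchers table and the name spin_spec are made up for illustration; the calls themselves are the same tidyr functions used in the original report.

```r
library(dplyr)
library(tidyr)

# Illustrative data only: a cut-down version of the table in the report.
pitchers <- tibble::tribble(
  ~last_name,  ~ff_avg_spin, ~si_avg_spin,
  "Bauer",             2286,         2276,
  "Verlander",         2583,           NA
)

# Step 1: build and adjust the spec, and give it a name.
spin_spec <- pitchers %>%
  build_longer_spec(
    cols = contains("spin"),
    names_to = "pitch_type",
    values_to = "avg_spin"
  ) %>%
  mutate(pitch_type = sub("_spin", "", pitch_type))

# Step 2: use the spec. No argument depends on a side effect of another,
# so the result does not depend on evaluation order.
long <- pitchers %>%
  pivot_longer_spec(spec = spin_spec, values_drop_na = TRUE)

print(long)
```

This costs one extra assignment, but it behaves the same under an eager or a lazy pipe.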

Thanks for the answer, @lionel-.

As for my motivation: sometimes I have to execute large amounts of code many times. To be faster (avoiding running multiple instructions/assignments separately, or having to select them before running them), I see three options: prepare all the code in a script and execute the script; separate the instructions with semicolons; or, as far as possible, collect consecutive instructions in a pipe workflow. Maybe the "good practice" is the first option, but the pipe workflow is more comfortable, at least for me, and makes it easy to change the code on the fly.

Actually, I think I have just found a solution, since the code

data %T>% 
  {nspec <<- build_longer_spec(.,
    cols = contains("spin"), 
    names_to = "pitch_type",
    values_to = "avg_spin") %>%
    mutate(pitch_type = sub("_spin", "", pitch_type))} %>%
  {pivot_longer_spec(., spec = nspec, values_drop_na = TRUE)}

seems to work with the dev version. (Although, given your answer, I'm not sure I understand why it works.)

Please let me know of any other ways to proceed like this under lazy evaluation. Otherwise, feel free to close.

Thank you for the details. I would be surprised if extra assignments are a bottleneck though. They might prevent the intermediary value from being collected for a little longer, but unless you're working with extremely large values that shouldn't be an issue.

seems to work with the dev version. (even though, by your answer, I'm not sure to understand why it works?)

I think it works because it expands to this: { .; pivot_longer_spec(.) }. Curly blocks are not treated specially in magrittr, so the piped value is added to the block as first argument. It usually doesn't do anything, but in this case it forces evaluation of the input. Nice finding!
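
The forcing effect can be sketched in base R without magrittr. This is only an analogy under stated assumptions: delayedAssign() stands in for the lazily piped value, and the names env, dot, and nspec are chosen for illustration, not taken from magrittr's internals.

```r
env <- new.env()

# `dot` plays the role of the lazily piped value. Forcing it has a side
# effect (defining `nspec`), like the `%T>%` step in the report.
delayedAssign(
  "dot",
  {
    assign("nspec", "spec", envir = env)
    "data"
  },
  assign.env = env
)

# Nothing has been forced yet, so `nspec` does not exist.
spec_before <- exists("nspec", envir = env, inherits = FALSE)

# Mentioning `dot` first forces the promise, which defines `nspec`
# before the second expression looks it up. This mirrors the shape
# { .; pivot_longer_spec(., spec = nspec) }.
result <- evalq({
  dot
  paste(dot, nspec)
}, env)

print(spec_before)
print(result)
```

The bare mention of dot at the top of the block is what guarantees the side effect has happened by the time nspec is needed.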

Ah, I see that @egnha and @hadley just discussed this in #224.

Thank you! I will keep testing dev magrittr and adapt my code this way when necessary. If I find any other issues, I'll let you know.