tidyverse / purrr

A functional programming toolkit for R

Home Page: https://purrr.tidyverse.org/


Better document downside of named elements in `list_cbind()`

rressler opened this issue

list_cbind() calls vec_cbind(), which produces "packed data frame columns" when the inputs are named.

I am not familiar with packed data frame columns, so I was expecting to get a data frame whose columns were atomic vectors. The example from the list_cbind() help (below) shows a different structure. This different structure also means the result of list_cbind() displays differently in an R script than in a qmd/rmd code chunk.

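library(purrr)
library(tidyr)  # provides unpack() and re-exports everything()

x2 <- list(
  a = data.frame(x = 1:2),
  b = data.frame(y = "a")
)
list_cbind(x2)       # prints the packed columns
str(list_cbind(x2))  # columns a and b are themselves data frames
list_cbind(x2) |> unpack(cols = everything()) |> str()  # flattened to x and y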

I suggest adding information to the help (or a vignette) on how to use unpack() to convert the result of list_cbind() into an unpacked data frame.

An alternative might be to add an unpack argument to list_cbind() so that users can unpack as part of the function call.
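Until something like that exists, the same effect can be sketched as a small user-level wrapper. This is only an illustration: list_cbind_unpacked() is a hypothetical name, not part of purrr, and it assumes tidyr is installed for unpack().

```r
library(purrr)
library(tidyr)

# Hypothetical helper (not a purrr function): column-bind a list of
# data frames, then flatten any packed columns created by named elements.
list_cbind_unpacked <- function(x) {
  unpack(list_cbind(x), cols = everything())
}

x2 <- list(
  a = data.frame(x = 1:2),
  b = data.frame(y = "a")
)

flat <- list_cbind_unpacked(x2)
names(flat)  # the inner column names, x and y
```

Note that the outer names (a and b) are dropped here; unpack() also has a names_sep argument if they should be kept as prefixes.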

What are you trying to do? It's most likely that you should avoid the packing step in the first place.

Thanks for responding (and everything else you do)!

I am not sure how to avoid the packing as

  1. list_cbind() combines elements into a data frame by column-binding them together with vctrs::vec_cbind() and then

  2. vec_cbind() creates packed data frame columns with named inputs.

I have not found a source that explains packed data frames or how to pass arguments to vec_cbind() to unpack them as part of the function call to list_cbind(). Perhaps a vignette could be added to tidyr.

We can just pipe to flatten, I guess, but that seems to be an extra step that newer functions such as pivot_wider() try to avoid by using arguments.
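For readers in the same position, here is a minimal sketch of what "packed" means, calling vctrs::vec_cbind() directly. It assumes vctrs is installed; the variable names packed and flat are only for illustration.

```r
library(vctrs)

# Named inputs: each data frame is stored as ONE column of the result,
# and that column is itself a data frame (a "packed" column).
packed <- vec_cbind(a = data.frame(x = 1:2), b = data.frame(y = "a"))
names(packed)            # outer names: a, b
is.data.frame(packed$a)  # the column a is a data frame
packed$a$x               # inner column reached with two levels of $

# Unnamed inputs: the inner columns are spliced in directly.
flat <- vec_cbind(data.frame(x = 1:2), data.frame(y = "a"))
names(flat)              # inner names: x, y
```

So a packed data frame is an ordinary data frame in which one or more columns are themselves data frames, which is why str() shows a nested structure.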

Appreciate any recommendations.
Thanks,
Richard

Oooh sorry, I think I missed the underlying issue here. You're getting this behaviour because list_cbind() is attempting to preserve the internal and external names. You can get the behaviour you want by stripping the names:

library(purrr)

x2 <- list(
  a = data.frame(x = 1:2),
  b = data.frame(y = "a")
)
str(list_cbind(unname(x2)))
#> 'data.frame':    2 obs. of  2 variables:
#>  $ x: int  1 2
#>  $ y: chr  "a" "a"

Created on 2023-07-27 with reprex v2.0.2

I'll think about how to point this out in the docs.

Thanks for the explanation and the solution approach!

P.S. FYI, even ChatGPT can't explain packed data.frames. :)

What is a good reference to explain the "packed data.frame" in R?
ChatGPT
As of my last update in September 2021, there isn't a native data structure called "packed data.frame" in R. ...