tidyverse / tibble

A modern re-imagining of the data frame

Home Page: https://tibble.tidyverse.org/

tibble allows setting a 'names' attribute on a column, while data.frame does not

AnBarbosaBr opened this issue

I'm not sure whether this is expected behaviour, but I find it counterintuitive, since tibbles do not even have row names.

Keeping these names also makes datasets larger, both in memory and when stored in certain formats. It caused me some trouble when saving the tibble as a parquet file using arrow, and the same may happen with other third-party packages that do not expect this kind of metadata.

n = 10
a = 1:n
b = 1:n
c = 1:n

x = data.frame(a, b, c)

data_frame_example = as.data.frame(x)

names(data_frame_example$a) = 1:n
sapply(data_frame_example, function(n) length(names(n)))
#> a b c 
#> 0 0 0

tibble_example = tibble::as_tibble(x)

names(tibble_example$a) = 1:n
sapply(tibble_example, function(n) length(names(n)))
#>  a  b  c 
#> 10  0  0


pryr::object_size(tibble_example)
#> Registered S3 method overwritten by 'pryr':
#>   method      from
#>   print.bytes Rcpp
#> 2.06 kB

pryr::object_size(data_frame_example)
#> 1.13 kB

# A more 'real world' example
model = glm(c~a)
tibble_example = modelr::add_predictions(tibble_example, model)
tibble_example$pred2 = predict(model, tibble_example)
sapply(tibble_example, function(n) length(names(n)))
#>     a     b     c  pred pred2 
#>    10     0     0    10    10

Created on 2022-06-30 by the reprex package (v2.0.0)
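
The names on `pred` and `pred2` above are consistent with `predict()` returning a vector named by row. A minimal base R sketch of that, and of dropping the names with `unname()` before storing the result; the names `d`, `fit`, `pred` and `clean` and the toy data are illustrative assumptions, not taken from the reprex:

```r
# Toy data, assumed for illustration only
d <- data.frame(a = 1:10, y = (1:10)^2)
fit <- glm(y ~ a, data = d)

# predict() returns a numeric vector carrying one name per row of newdata
pred <- predict(fit, d)
names(pred)

# unname() keeps the values and drops the names attribute
clean <- unname(pred)
is.null(names(clean))

# The names attribute accounts for the extra memory
object.size(pred) > object.size(clean)
```

Wrapping the call as `unname(predict(model, tibble_example))` at assignment time should keep the new column free of inner names.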

Thanks. This is intended, see https://tibble.tidyverse.org/reference/tibble.html:

Inner names in columns are left unchanged.
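
Since inner names are kept by design, one option for callers who need plain columns (for example before handing the tibble to a parquet writer) is to strip them explicitly. A minimal sketch, assuming only atomic columns should be touched; `strip_inner_names` is a hypothetical helper, not part of tibble:

```r
library(tibble)

# Hypothetical helper: drop the names attribute from every atomic column.
# List and data frame columns are left alone, because unname() would also
# remove their element or column names.
strip_inner_names <- function(df) {
  for (nm in names(df)) {
    col <- df[[nm]]
    if (is.atomic(col)) {
      df[[nm]] <- unname(col)
    }
  }
  df
}

# tibble() leaves the inner names of column `a` in place
tb <- tibble(a = c(x = 1L, y = 2L, z = 3L), b = 4:6)
names(tb[["a"]])

clean <- strip_inner_names(tb)
sapply(clean, function(col) length(names(col)))
```

The loop uses `[[<-` one column at a time so the result keeps its tibble class and column order.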