tibble allows setting a 'names' attribute on a column while data.frame does not
AnBarbosaBr opened this issue
I'm not sure if this is expected behaviour, but I find it counterintuitive, since tibbles do not even have row names.
These inner names also make the dataset larger, both in memory and when stored in certain formats. They caused me some trouble when trying to save the tibble as a parquet file using arrow, and the same may happen with other third-party packages that do not expect to find this kind of metadata.
n = 10
a = 1:n
b = 1:n
c = 1:n
x = data.frame(a, b, c)
data_frame_example = as.data.frame(x)
names(data_frame_example$a) = 1:n
sapply(data_frame_example, function(n) length(names(n)))
#> a b c
#> 0 0 0
tibble_example = tibble::as_tibble(x)
names(tibble_example$a) = 1:n
sapply(tibble_example, function(n) length(names(n)))
#> a b c
#> 10 0 0
pryr::object_size(tibble_example)
#> Registered S3 method overwritten by 'pryr':
#> method from
#> print.bytes Rcpp
#> 2.06 kB
pryr::object_size(data_frame_example)
#> 1.13 kB
# A more 'real world' example
model = glm(c~a)
tibble_example = modelr::add_predictions(tibble_example, model)
tibble_example$pred2 = predict(model, tibble_example)
sapply(tibble_example, function(n) length(names(n)))
#> a b c pred pred2
#> 10 0 0 10 10
Created on 2022-06-30 by the reprex package (v2.0.0)
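A possible workaround, sketched here under the assumption that the inner names carry no information worth keeping, is to strip them before handing the tibble to arrow or any other package. `strip_inner_names()` is a hypothetical helper written for this illustration; it is not part of tibble or arrow.

```r
library(tibble)

# Hypothetical helper (not part of tibble or arrow): remove the inner
# 'names' attribute from every atomic column; other columns are untouched.
strip_inner_names <- function(df) {
  df[] <- lapply(df, function(col) if (is.atomic(col)) unname(col) else col)
  df
}

tb <- tibble(a = 1:10, b = 1:10)
names(tb$a) <- 1:10   # the tibble keeps these inner names on column 'a'

clean <- strip_inner_names(tb)

# Count the inner names left on each column of the cleaned tibble
sapply(clean, function(col) length(names(col)))
```

The `is.atomic()` check is a design choice: it leaves list columns alone, where element names are more likely to be meaningful.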
Thanks. This is intended, see https://tibble.tidyverse.org/reference/tibble.html:
"Inner names in columns are left unchanged."
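Since the names are kept by design, one option is to drop them at the point where they are created. The sketch below uses made-up data and `lm()` instead of the `glm()` call from the report; the column names `pred_named` and `pred_plain` are illustrative only.

```r
library(tibble)

# Made-up data for illustration
d <- tibble(
  x = 1:10,
  y = c(1.2, 1.9, 3.1, 4.2, 4.8, 6.1, 7.0, 8.2, 8.9, 10.1)
)
model <- lm(y ~ x, data = d)

# predict() returns a named vector, and the tibble keeps those names
d$pred_named <- predict(model, d)

# Wrapping the call in unname() stores a plain vector instead
d$pred_plain <- unname(predict(model, d))

# Count the inner names on each column
sapply(d, function(col) length(names(col)))
```

Both prediction columns hold the same values; only the `names` attribute differs.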