Support `vars` option in get_file/get_dataframe
kuriwaki opened this issue · comments
Shiro Kuriwaki commented
`vars` should be an argument that subsets the columns of the dataset to pull. However, it seems to have no effect: the whole dataset is returned.
library(dataverse)
df_tab_all <-
  get_file_by_name(
    filename = "roster-bulls-1996.tab",
    dataset = "doi:10.70122/FK2/HXJVJU",
    server = "demo.dataverse.org"
  )
df_tab_vars <-
  get_file_by_name(
    filename = "roster-bulls-1996.tab",
    dataset = "doi:10.70122/FK2/HXJVJU",
    server = "demo.dataverse.org",
    vars = c("number", "player") # only two columns
  )
# the full download should be larger than the two-column subset
stopifnot(object.size(df_tab_all) > object.size(df_tab_vars))
#> Error: object.size(df_tab_all) > object.size(df_tab_vars) is not TRUE
# does it work on get_dataframe?
df_tab_vars <-
  get_dataframe_by_name(
    filename = "roster-bulls-1996.tab",
    dataset = "doi:10.70122/FK2/HXJVJU",
    server = "demo.dataverse.org",
    vars = c("number", "player") # only two columns
  )
#> Downloading ingested version of data with readr::read_tsv. To download the original version and remove this message, set original = TRUE.
#> Rows: 15 Columns: 9
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (6): player, position, height, dob, country_birth, college
#> dbl (3): number, weight, experience_years
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
ncol(df_tab_vars)
#> [1] 9
Created on 2022-01-12 by the reprex package (v2.0.1)
EDITED 2022-01-12: updated with the new version of dataverse, which now avoids errors and lets the reprex run.
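
Until `vars` is honored, one possible stopgap is to drop the unwanted columns on the client after the download. The following is only a minimal sketch: `select_vars()` is a hypothetical helper, not part of the dataverse package, and the small table is made-up stand-in data so the example runs without network access.

```r
# Hypothetical helper (not part of the dataverse package):
# keep only the requested columns and fail loudly if any are missing.
select_vars <- function(df, vars) {
  missing_vars <- setdiff(vars, names(df))
  if (length(missing_vars) > 0) {
    stop("Columns not found: ", paste(missing_vars, collapse = ", "))
  }
  df[, vars, drop = FALSE]
}

# Made-up stand-in for the downloaded tab-separated file.
tsv_text <- paste(
  "number\tplayer\tposition",
  "1\tPlayer A\tguard",
  "2\tPlayer B\tforward",
  sep = "\n"
)
df_all <- read.delim(text = tsv_text, stringsAsFactors = FALSE)

# Client-side equivalent of vars = c("number", "player").
df_vars <- select_vars(df_all, c("number", "player"))
ncol(df_vars)
```

With the reprex above, the same helper could be applied to the result of `get_dataframe_by_name()`, e.g. `select_vars(df_tab_vars, c("number", "player"))`. This still downloads every column, so it does not replace a server-side subset.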