IQSS / dataverse-client-r

R Client for Dataverse Repositories

Home Page:https://iqss.github.io/dataverse-client-r

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Support `vars` option in get_file/get_dataframe

kuriwaki opened this issue · comments

vars should be an argument that subsets the columns of the dataset to pull. However, it seems to not affect anything and just returns the whole dataset.

library(dataverse)

df_tab_all <-
  get_file_by_name(
    filename = "roster-bulls-1996.tab",
    dataset  = "doi:10.70122/FK2/HXJVJU",
    server   = "demo.dataverse.org"
  )

df_tab_vars <-
  get_file_by_name(
    filename = "roster-bulls-1996.tab",
    dataset  = "doi:10.70122/FK2/HXJVJU",
    server   = "demo.dataverse.org",
    vars = c("number", "player") # only two columns
  )

# first data should be larger (more data)
stopifnot(object.size(df_tab_all) > object.size(df_tab_vars))
#> Error: object.size(df_tab_all) > object.size(df_tab_vars) is not TRUE


# does it work on get_dataframe?
df_tab_vars <-
  get_dataframe_by_name(
    filename = "roster-bulls-1996.tab",
    dataset  = "doi:10.70122/FK2/HXJVJU",
    server   = "demo.dataverse.org",
    vars = c("number", "player") # only two columns
  )
#> Downloading ingested version of data with readr::read_tsv. To download the original version and remove this message, set original = TRUE.
#> Rows: 15 Columns: 9
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (6): player, position, height, dob, country_birth, college
#> dbl (3): number, weight, experience_years
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
ncol(df_tab_vars)
#> [1] 9

Created on 2022-01-12 by the reprex package (v2.0.1)

EDITED 2021-01-12 with new version of dataverse, which now avoids errors and fixes the reprex.