Support `versions` argument in get_file/get_dataframe
kuriwaki opened this issue
Shiro Kuriwaki commented
For datasets that periodically update (e.g. this is on V6), it would be good for the get_*
functions to have an option to specify the version number.
Looks like there is an API for that
https://guides.dataverse.org/en/latest/api/dataaccess.html#download-by-dataset-by-version
However, see #27 for an error in dataset_versions.
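To make the linked endpoint concrete, here is a minimal sketch of how the request URL might be assembled. It is not the package's implementation: `build_version_url` is a hypothetical helper, and the path layout is my reading of the "download by dataset by version" section of the Data Access API guide, so it should be checked against the guide for the server in question.

```r
# Hypothetical helper (not part of the dataverse package): builds the
# Data Access API URL for downloading the files of one dataset version.
build_version_url <- function(doi, version, server) {
  # Versions may be numbers such as 1 or 1.1; keep them as plain text
  version <- as.character(version)
  # Percent-encode the DOI so ":" and "/" are safe in a query string
  pid <- utils::URLencode(doi, reserved = TRUE)
  paste0(
    "https://", server,
    "/api/access/dataset/:persistentId/versions/", version,
    "?persistentId=", pid
  )
}

# Example with the demo dataset used later in this thread
build_version_url("doi:10.70122/FK2/PPIAXE", 1.1, "demo.dataverse.org")
```

The helper only builds a string and makes no request, so it can be inspected before being handed to an HTTP client.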
Shiro Kuriwaki commented
We already seem to have this; it's just not documented and was passed through `...`. But files that exist in an older version and no longer exist in `:latest` are not caught. Should be ready after fixing that.
library(dataverse)
library(readr)
packageVersion("dataverse")
#> [1] '0.3.10'
# setup
doi <- "doi:10.70122/FK2/PPIAXE"
Sys.setenv(DATAVERSE_SERVER = "demo.dataverse.org")
fun <- function(x) read_tsv(x, col_types = cols())
# Expected Success
d1 <- get_dataframe_by_name("nlsw88.tab", doi, .f = fun, version = 1)
d2 <- get_dataframe_by_name("nlsw88.tab", doi, .f = fun, version = 1.1)
# Expected ERROR - version 99 does not exist
d3 <- get_dataframe_by_name("nlsw88.tab", doi, .f = fun, version = 99)
#> Error in dataset_files(prepend_doi(x), key = key, server = server, ...): Not Found (HTTP 404). Failed to Dataset version 99 of dataset 1734015 not found.
# ERROR to fix
# A filename that no longer exists on latest version
# https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/GDF6Z0&version=4.0
cc <- get_dataframe_by_name(
  filename = "CCES16_Common_OUTPUT_Jul2017_VV.tab",
  version = 4,
  dataset = "10.7910/DVN/GDF6Z0",
  server = "dataverse.harvard.edu",
  original = TRUE,
  .f = haven::read_dta)
#> Error in get_fileid.character(x = dataset, file = filename, ...): File not found
Created on 2022-01-12 by the reprex package (v2.0.1)
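One way the final "File not found" error might be avoided is to resolve the file name against the file list of the requested version rather than `:latest`. The sketch below is a guess at that logic, not the package's code: `find_file_id` is a hypothetical helper, and the list shape (a `label` plus a nested `dataFile$id`) is an assumption about what a per-version file listing contains. Mock data with made-up names and ids stands in for the API response.

```r
# Hypothetical helper: find a file's numeric id by name within the file
# list of ONE dataset version (assumed shape: label + dataFile$id).
find_file_id <- function(files, filename) {
  labels <- vapply(files, function(f) f$label, character(1))
  hit <- which(labels == filename)
  if (length(hit) == 0) {
    stop("File not found in the requested version: ", filename)
  }
  # If several entries share the name, take the first match
  files[[hit[1]]]$dataFile$id
}

# Mock listings standing in for two versions of the same dataset
v4 <- list(
  list(label = "old_name.tab", dataFile = list(id = 101L)),
  list(label = "guide.pdf",    dataFile = list(id = 102L))
)
latest <- list(
  list(label = "new_name.tab", dataFile = list(id = 201L)),
  list(label = "guide.pdf",    dataFile = list(id = 102L))
)

find_file_id(v4, "old_name.tab")           # found in version 4
try(find_file_id(latest, "old_name.tab"))  # errors: name is gone from latest
```

In the real package, the listing would presumably come from `dataset_files()` called with the requested version, and the resulting id could then be passed to `get_dataframe_by_id()`. Whether those calls accept the arguments in exactly that form should be checked against the package documentation.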