Improve doc on how to read objects without object assignment
kuriwaki opened this issue
.RData files cannot be read in as an object; load() instead places their contents directly into the user's environment under the names they were saved with. I think we should all be switching to .rds (see IQSS/dataverse#7249), but some files on Dataverse are nonetheless uploaded as .RData.
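The difference can be seen with base R alone. The sketch below uses a made-up data frame called toy and temporary files, none of which are part of the Dataverse example: load() restores the object under its saved name and returns only that name, while readRDS() returns the object itself.

```r
# Toy data frame, for illustration only
toy <- data.frame(x = 1:3, y = c("a", "b", "c"))

# Save the same object in both formats
rdata_path <- tempfile(fileext = ".RData")
rds_path   <- tempfile(fileext = ".rds")
save(toy, file = rdata_path)
saveRDS(toy, file = rds_path)

rm(toy)

# load() recreates `toy` in the current environment and invisibly returns
# a character vector of the names it restored, not the data itself
loaded_names <- load(rdata_path)
loaded_names
exists("toy")

# readRDS() returns the object, so the caller chooses the name
from_rds <- readRDS(rds_path)
identical(from_rds, toy)
```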
There are two ways to load such a file from Dataverse. One is the older approach: write the downloaded bytes to a file, then re-read that file with load(). The other is to load the file into a temporary environment inside a function, as I found on Stack Overflow. See both in the reprex below; they return identical objects.
We should update the doc with an example.
h/t @jonrobinson2
library(dataverse)
library(fs)
# Algara dataset
# https://dataverse.harvard.edu/file.xhtml?fileId=5028532&version=1.0
# 1. writing and saving as binary works
as_binary <- get_file_by_id(file = 5028532, server = "dataverse.harvard.edu")
temp <- tempdir()
writeBin(as_binary, path(temp, "county.RData"))
load(path(temp, "county.RData"))
str(pres_elections_release)
#> 'data.frame': 113756 obs. of 20 variables:
#> $ election_year : num 1868 1872 1876 1880 1884 ...
#> $ fips : chr "01001" "01001" "01001" "01001" ...
#> $ county_name : chr "AUTAUGA" "AUTAUGA" "AUTAUGA" "AUTAUGA" ...
#> $ state : chr "AL" "AL" "AL" "AL" ...
#> $ sfips : chr "01" "01" "01" "01" ...
#> $ office : chr "PRES" "PRES" "PRES" "PRES" ...
#> $ election_type : chr "G" "G" "G" "G" ...
#> $ seat_status : chr "Open Seat" "Republican President Re-election" "Open Seat" "Open Seat" ...
#> $ democratic_raw_votes : num 851 669 804 978 911 ...
#> $ dem_nominee : chr "Horatio Seymour" "Horace Greeley" "Samuel J. Tilden" "Winfield Scott Hancock" ...
#> $ republican_raw_votes : num 1505 1593 1576 974 877 ...
#> $ rep_nominee : chr "Ulysses S. Grant" "Ulysses S. Grant" "Rutherford B. Hayes" "James A. Garfield" ...
#> $ pres_raw_county_vote_totals_two_party: num 2356 2262 2380 1952 1788 ...
#> $ raw_county_vote_totals : num 2356 2262 2380 1967 1789 ...
#> $ county_first_date : Date, format: "1818-11-21" "1818-11-21" ...
#> $ county_end_date : Date, format: NA NA ...
#> $ state_admission_date : chr "1819-12-14" "1819-12-14" "1819-12-14" "1819-12-14" ...
#> $ complete_county_cases : num 1 1 1 1 1 1 1 1 1 1 ...
#> $ original_county_name : chr NA NA NA NA ...
#> $ original_name_end_date : Date, format: NA NA ...
# 2. How about reading directly into R? This is an .RData file, which we usually read with load().
# via: https://stackoverflow.com/questions/34925668/r-assign-content-from-rda-object-with-load
load_object <- function(file) {
  tmp <- new.env()
  load(file = file, envir = tmp)
  tmp[[ls(tmp)[1]]]
}
as_rda <- get_dataframe_by_id(file = 5028532,
                              server = "dataverse.harvard.edu",
                              .f = load_object,
                              original = TRUE)
identical(as_rda, pres_elections_release)
#> [1] TRUE
Created on 2021-09-16 by the reprex package (v2.0.1)
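One caveat with load_object() as written: ls(tmp)[1] picks only the first object in sorted name order, so any other objects saved in the same file are dropped without notice. A possible stricter variant, sketched here under the hypothetical name load_single_object() (it is not part of the dataverse package), stops when the file holds anything other than exactly one object. The objects a and b are made up for the check.

```r
# Hypothetical helper: like load_object(), but refuses ambiguous files
load_single_object <- function(file) {
  tmp <- new.env()
  # load() returns the names of the objects it restored
  nms <- load(file = file, envir = tmp)
  if (length(nms) != 1) {
    stop("Expected exactly one object, found ", length(nms), ": ",
         paste(nms, collapse = ", "))
  }
  tmp[[nms]]
}

# Quick check with made-up objects
a <- 1:3
b <- letters
one_path <- tempfile(fileext = ".RData")
two_path <- tempfile(fileext = ".RData")
save(a, file = one_path)
save(a, b, file = two_path)

# A file with one object is returned as that object
single <- load_single_object(one_path)

# A file with two objects raises an error; capture its message
multi_error <- tryCatch(
  load_single_object(two_path),
  error = function(e) conditionMessage(e)
)
```

Assuming the same call shape as in the reprex, it could be passed as .f = load_single_object in place of load_object.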
Implemented in 0.3.14