andybega / rpolecat

Download and work with the POLECAT event data.

rpolecat

rpolecat makes downloading and working with the POLECAT event data easier.

The main download functionality works. Further features are planned; see the roadmap in issue #1.

The POLECAT data are provided via two repositories on dataverse:

The data are described in two papers:

Installation

You can install the development version of rpolecat like so:

library(remotes)
install_github("basil-analytics/rpolecat")

rpolecat depends on the R Dataverse Client to interact with the Dataverse API. That package requires two environment variables to be set before it can reach the API. More details are documented in the API Access Keys section of the dataverse R package readme.

One way to meet this requirement without having to set the variables every time you start R is to:

  1. Obtain an API access token from Harvard Dataverse.

  2. Add the following lines to your .Rprofile file:

    # dataverse API token  
    Sys.setenv(DATAVERSE_SERVER = "dataverse.harvard.edu")  
    Sys.setenv(DATAVERSE_KEY = "<your API token>")  

    (You can find and open your .Rprofile file using usethis::edit_r_profile() if the usethis package is installed.)

  3. Restart R for the changes to take effect.
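After restarting, it can be worth confirming that both variables are visible to R before calling any download function. The following is a minimal sketch using only base R; the helper name `check_dataverse_env` is hypothetical and is not part of rpolecat or the dataverse package.

```r
# Hypothetical helper (not part of rpolecat): report which of the two
# environment variables named above are set to a non-empty value.
check_dataverse_env <- function(vars = c("DATAVERSE_SERVER", "DATAVERSE_KEY")) {
  values <- Sys.getenv(vars, unset = "")
  is_set <- nzchar(values)
  names(is_set) <- vars
  if (!all(is_set)) {
    # Name the missing variables without printing any token value
    warning("Not set: ", paste(vars[!is_set], collapse = ", "))
  }
  invisible(is_set)
}

check_dataverse_env()
```

The function returns a named logical vector invisibly, so it can also be used in a script as `stopifnot(all(check_dataverse_env()))`.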

Example

One of the main functions of this package is to download the POLECAT data without manually pointing and clicking on Dataverse:

# (not run)

# one-stop-shop for getting all data and keeping updated with new data:
# 1st time
download_polecat(local_dir = "my/data/dir", skip_existing = TRUE)
# next times
download_polecat(local_dir = "my/data/dir", years = 2023, skip_existing = TRUE)

# see ?download_polecat

This downloads the files as they are: the weekly files for the current year remain unzipped, and the yearly ZIP archives of historical data remain zipped. The “skip_existing” functionality should still handle things correctly if you zip or unzip files afterwards; see ?download_polecat for more details.
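The actual matching rules live in the package. The following is only a rough sketch of the general idea, assuming that a file counts as already present when a local file has the same name with or without a trailing `.zip`. The helpers `strip_zip` and `files_to_download` and the example file names are hypothetical; they are not taken from rpolecat or from Dataverse.

```r
# Hypothetical sketch of a "skip existing" check; not rpolecat's implementation.
# A remote file is treated as present if a local file matches its name after
# a trailing ".zip" (if any) is stripped from both sides.
strip_zip <- function(x) sub("\\.zip$", "", x, ignore.case = TRUE)

files_to_download <- function(remote_files, local_files) {
  already_have <- strip_zip(remote_files) %in% strip_zip(basename(local_files))
  remote_files[!already_have]
}

# Made-up file names, for illustration only
remote <- c("events-2022.txt.zip", "events-week-01.txt", "events-week-02.txt")
local  <- c("my/data/dir/events-2022.txt", "my/data/dir/events-week-01.txt.zip")
files_to_download(remote, local)
#> [1] "events-week-02.txt"
```

Comparing names with the `.zip` suffix removed is what lets an unzipped yearly archive or a manually zipped weekly file still count as downloaded in this sketch.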

The package also includes information from the PLOVER ontology:

library(rpolecat)

data(contexts)
head(contexts)
#>                  context
#> 1               military
#> 2           intelligence
#> 3              executive
#> 4            legislative
#> 5               election
#> 6 political_institutions

data(modes)
head(modes)
#>   event_type         mode
#> 1    consult        visit
#> 2    consult  third-party
#> 3    consult multilateral
#> 4    consult        phone
#> 5    retreat     withdraw
#> 6    retreat      release
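Because `modes` pairs each mode with its event type, it can serve as a simple lookup table. The following small sketch uses only the six rows printed above, re-entered by hand so that it runs without the package installed; the full `modes` data set presumably contains more rows than this.

```r
# The six rows shown by head(modes) above, re-entered by hand
modes_head <- data.frame(
  event_type = c("consult", "consult", "consult", "consult", "retreat", "retreat"),
  mode = c("visit", "third-party", "multilateral", "phone", "withdraw", "release"),
  stringsAsFactors = FALSE
)

# Group the modes by event type; the result is a named list with one
# character vector of modes per event type
modes_by_type <- split(modes_head$mode, modes_head$event_type)
modes_by_type
```

With the package loaded, the same `split()` call on `modes$mode` and `modes$event_type` should give the grouping for the complete data set.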

Data license

The data license is viewable on dataverse. We copy it here for convenience:

The POLECAT data are produced by the Program on Geostrategic Risk (formerly the Political Instability Task Force). The Program on Geostrategic Risk is funded by the Central Intelligence Agency. The views expressed are the authors’ alone and do not represent the views of the U.S. Government. We are unable to provide the story text from which events are extracted or the URLs due to licensing restrictions. For any data issues or bug reports please contact the dataset points of contact. THESE MATERIALS ARE SUBJECT TO COPYRIGHT PROTECTION AND MAY ONLY BE USED AND COPIED FOR RESEARCH AND EDUCATIONAL PURPOSES. THE MATERIALS MAY NOT BE USED OR COPIED FOR ANY COMMERCIAL PURPOSES. © 2023 Leidos. All rights reserved. THE MATERIALS ARE PROVIDED ON AN AS-IS BASIS, WITH NO WARRANTIES OR GUARANTIES OF ANY KIND. THE OWNERS WILL NOT BE LIABLE FOR ANY DAMAGES ARISING FROM THEIR USE. USE OF THE MATERIALS IS ENTIRELY AT YOUR OWN RISK.
