- The International Crisis Behavior Events (ICBe)
- The Paper:
- The Data:
- The Authors:
- Citation:
- Replication Code and Analysis
- data preparation
- Data inputs:
This is the GitHub repository for the International Crisis Behavior Events (ICBe) dataset. Please submit any issues regarding the dataset, paper, or repository using the Issues tab.
AS OF JUNE 2024, THE DATAVERSE REPLICATION FILES ARE THE MOST UP-TO-DATE. PLEASE USE THAT REPOSITORY FIRST.
Introducing the ICBe Dataset: Very High Recall and Precision Event Extraction from Narratives about International Crises (also available on ArXiv).
The v1.1 Online Appendix can be downloaded here.
Version 1.1 is the most recent version and was posted on ArXiv on July 26, 2022. Version 1.0 was posted on February 15, 2022.
The agreed datasets are the final datasets used in much of the paper and its figures. They include our best efforts at cleaning the data and reconciling intercoder agreement. The data are available in long and wide format, and also in .tsv format in the same folders.
- ICBe_V1.1_long.Rds
- ICBe_V1.1_long_agreement.Rds
- All coded values individually along with information about how often they were selected by coders.
ICBe_V1.1_long.Rds filtered down only to those codings that were agreed upon (see Algorithm 1 in the paper).
- ICBe_V1.1_events_agreed_long.Rds
- ICBe_V1.1_events_agreed.Rds
The coding and cleaning processes are described in the paper, with additional information and details about the variables in the codebook.
- ICBEdataset Codebook
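For readers working outside R, the .tsv copies can be inspected with any tab-separated reader. The following is a minimal sketch in Python using only the standard library. The file name and the column name `crisno` in the usage comment are assumptions for illustration, not names confirmed by this README, so check the codebook for the actual variables.

```python
import csv
from collections import Counter
from pathlib import Path


def read_tsv(path):
    """Read a tab-separated file into a list of dicts keyed by the header row."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def count_by(rows, column):
    """Count rows per distinct value of `column` (for example, rows per crisis)."""
    return Counter(row[column] for row in rows)


# Hypothetical usage; adjust the path and column name to the files you downloaded.
# rows = read_tsv("ICBe_V1.1_events_agreed.tsv")
# print(len(rows), "rows;", count_by(rows, "crisno").most_common(5))
```

All values are read as strings, so convert dates and numeric fields explicitly before analysis.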
Rex W. Douglass, Thomas Leo Scherer, J. Andrés Gannon, Erik Gartzke, Jon Lindsay, Shannon Carcelli, Jonathan Wilkenfeld, David M. Quinn, Catherine Aiken, Jose Miguel Cabezas Navarro, Neil Lund, Egle Murauskaite, and Diana Partridge.
For any use of the dataset or paper, please cite:
Douglass, Rex W., Thomas Leo Scherer, J. Andrés Gannon, Erik Gartzke, Jon Lindsay, Shannon Carcelli, Jonathan Wilkenfeld, David M. Quinn, Catherine Aiken, Jose Miguel Cabezas Navarro, Neil Lund, Egle Murauskaite, and Diana Partridge. 2022. “Introducing the ICBe Dataset: Very High Recall and Precision Event Extraction from Narratives about International Crises.” arXiv:2202.07081 [cs, stat]. http://arxiv.org/abs/2202.07081.
A description of the files and folders in the repository used to create the datasets, tables, and figures.
- download_and_clean
- creates a succinct rds of the crisis narratives: .replication_corpus/data/out/icb_corpus_V1.0_May_16_2022
- 01_compile_saves_and_align
- compiles the original coding files into ./replication_data/in/icb_long_spans.Rds. The original coding files are not in the public repository; public users will load icb_long_spans.Rds directly.
- aligns codings from multiple GUI versions on similar source sentences
- 02_format_and_clean
- applies cleaning dictionaries to create ./replication_data/out/ICBe_V1.1_long_clean.Rds.
- 03_aggregation
- applies the aggregation algorithm to create ./replication_data/out/ICBe_V1.1_long_agreement.Rds, ICBe_V1.1_long.Rds, ./replication_data/out/ICBe_V1.1_events_agreed.Rds, and ./replication_data/out/ICBe_V1.1_events_agreed_long.Rds.
- 04_validation
- applies iconography to crises and events to create ICBe_V1.1_crises_markdown.Rds and ICBe_V1.1_events_agreed_markdown.Rds
The figures are created in the Rmd file for the paper (./replication_paper/pnas_draft/ICBe_pnas_submission_rmd.Rmd). In some cases they have been converted to other formats using the GNU Image Manipulation Program.
case_study_cuban_precision.png
recall_cuban_and_crimea_andcounts.png
- Draws from the Cuban Missile Automated Case Study googlesheet and the Crimea-Donbas Automated Case Study (redone) googlesheet.
- uses the iconography in ./replication_data/in/flags_small/
p_precision_combined.png
- combines metro maps of the two case studies using ICBe_V1.1_events_agreed.Rds
p_semantic_embeddings_dendro.png
- plot of semantic embeddings of ICBe_V1.1_events_agreed_markdown.Rds
p_precision_icews.png
- mapping of ICEWS using ./replication_paper/data/out/icews_clean_471_lowest.tsv (created in ./replication_paper/pnas_draft/appendix.Rmd)
- Cleaning dictionaries: used to clean raw codings for actors, actions, locations, and dates: /replication_paper/data/in/icb_manual_recording_master_sheet.xlsx
- Lit review and tree/leaf codebook: replication_data/in/icbe_litreview_trees_sentences.xlsx and /replication_paper/data/in/icbe_litreview_trees_sentences.xlsx
- Case study tables: /replication_paper/data/in/CaseStudies.xlsx
- The ICB project
- System-level (icb1v14.csv) and Actor-level (icb2v14.csv) datasets
- Dyadic-Level Crisis Data (source)
- Militarized Interstate Disputes (MID) version 5.01 at the incident level (MIDI_5.01.Rds) and incident-participant level (MIDIP_5.01.Rds) converted to Rds.
- UCDP Georeferenced Event Dataset (GED) Global version 21.1 (GEDEvent_v23_1.RData)
- Cameo Event Codes adapted from CAMEO Conflict and Mediation Event Observations Codebook (cameo.eventcode.txt)
- Phoenix Event data
- Terrier event data
- too large to include in the github repository
- to replicate, download the folder ‘largegeolocatedata’ to ICBEdata/replication_paper/data/ignore and decompress
- ICEWS data
- too large to include in the github repository
- to replicate, download folder ‘dataverse_files’ to ICBEdata/replication_paper/data/ignore/ and decompress
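Before running the replication scripts, it can help to confirm that the inputs listed above are in place. The sketch below is a hypothetical helper, not part of the repository: it takes the relative paths quoted in this section (with leading slashes dropped) and reports which ones are missing under a given repository root. The function name `missing_inputs` and the constant `EXPECTED_INPUTS` are illustrative assumptions.

```python
from pathlib import Path

# Paths copied from the data inputs list, relative to the repository root.
EXPECTED_INPUTS = [
    "replication_paper/data/in/icb_manual_recording_master_sheet.xlsx",
    "replication_data/in/icbe_litreview_trees_sentences.xlsx",
    "replication_paper/data/in/icbe_litreview_trees_sentences.xlsx",
    "replication_paper/data/in/CaseStudies.xlsx",
    "replication_paper/data/ignore",  # where the large downloaded folders go
]


def missing_inputs(repo_root, expected=EXPECTED_INPUTS):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Run from the repository root; prints one line per missing input.
    for rel in missing_inputs("."):
        print("missing:", rel)
```

An empty result only shows that the paths exist; it does not verify that the large folders were fully downloaded and decompressed.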
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.