jrcpulliam / reinfections

SARS-CoV-2 reinfection trends in South Africa: analysis of routine surveillance data (Pulliam et al. 2021)

SARS-CoV-2 reinfection trends in South Africa: analysis of routine surveillance data

This repository provides code and data for the analyses presented in:

Pulliam, JRC, C van Schalkwyk, N Govender, A von Gottberg, C Cohen, MJ Groome, J Dushoff, K Mlisana, and H Moultrie. (2022) Increased risk of SARS-CoV-2 reinfection associated with emergence of Omicron in South Africa. Science DOI: 10.1126/science.abn4947

The materials in this repository are made available under a CC-BY-NC 4.0 International License. See the LICENSE file for additional information.

Note: An early release of this repository (v1.0) contained a large simulation output file (>50 MB), which made downloading or cloning slow on low-bandwidth internet connections. That file and another moderately large file have been removed from the repository but can be downloaded from elsewhere (see the Output files section for more details) or recreated using the code provided.

If you have questions or comments, please contact the repository maintainer, Juliet Pulliam, at pulliam@sun.ac.za.

Software requirements

  • R - a statistical programming language (download links for Windows, Linux, and macOS)
  • RStudio (recommended) - a user interface for R (download link)

The following R packages are required to run the code in this repository (version numbers indicate the versions used for manuscript preparation):

  • coda (0.19-4)
  • colorspace (2.0-1)
  • data.table (1.14.0)
  • ggplot2 (3.3.4)
  • hexbin (1.28.2)
  • jsonlite (1.7.2)
  • lme4 (1.1-27.1)
  • Matrix (1.3-4)
  • patchwork (1.1.1)
  • uniformly (0.1.0)
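
To compare a local R installation against the versions listed above, something like the following sketch may help. It is not the repository's install.R; `check_packages()` is a hypothetical helper written for illustration, and it only reports what is installed (it does not install anything).

```r
# Package versions listed in this README (used for manuscript preparation).
required <- c(
  coda = "0.19-4", colorspace = "2.0-1", data.table = "1.14.0",
  ggplot2 = "3.3.4", hexbin = "1.28.2", jsonlite = "1.7.2",
  lme4 = "1.1-27.1", Matrix = "1.3-4", patchwork = "1.1.1",
  uniformly = "0.1.0"
)

# Hypothetical helper (not part of the repository): reports, for each package,
# whether it is installed and whether the installed version is at least the
# listed one.
check_packages <- function(required) {
  pkgs <- names(required)
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  found <- rep(NA_character_, length(pkgs))
  at_least <- rep(NA, length(pkgs))
  for (i in which(installed)) {
    v <- utils::packageVersion(pkgs[i])
    found[i] <- as.character(v)
    at_least[i] <- v >= package_version(required[[i]])
  }
  data.frame(
    package = pkgs,
    listed = unname(required),
    installed = unname(installed),
    found = found,
    at_least_listed = at_least,
    stringsAsFactors = FALSE
  )
}

print(check_packages(required))
```

Newer package versions will often work, but results may differ slightly from those in the manuscript.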

Pipeline files

The files listed below are located in the main directory:

  • Makefile - full pipeline via GNU Make (requires use of Unix-like command line); see https://www.gnu.org/software/make/ for more information
  • pub.json - configuration file used for manuscript preparation
  • test.json - test configuration file (useful to see how the code works without requiring intensive computation)
  • reinfections_pub.Rproj - R project file, which can be used for easy file navigation in interactive mode
  • LICENSE - license information
  • README.md - this file
  • .gitignore - specifies files and file types for version control to ignore
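
Before running the pipeline, it can be useful to look at what a configuration file contains. The sketch below reads a JSON configuration with the jsonlite package (listed above) and prints its structure. `read_config()` is a hypothetical helper, and no particular keys are assumed, since the contents of pub.json and test.json are not described here.

```r
# Hypothetical helper (not part of the repository): the repository's scripts
# may read their configuration differently.
read_config <- function(path) {
  if (!requireNamespace("jsonlite", quietly = TRUE)) {
    stop("The jsonlite package is required to read JSON configuration files.")
  }
  if (!file.exists(path)) {
    stop("Configuration file not found: ", path)
  }
  jsonlite::fromJSON(path)
}

# Inspect the test configuration when run from the repository's main directory.
if (file.exists("test.json") && requireNamespace("jsonlite", quietly = TRUE)) {
  cfg <- read_config("test.json")
  str(cfg)  # shows parameter names and values without assuming any keys
}
```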

Data files

The files listed below are located in the data subdirectory:

  • ts_data.csv - national daily time series of newly detected putative primary infections (cnt), suspected second infections (reinf), suspected third infections (third), and suspected fourth infections (fourth) by specimen receipt date (date)
  • demog_data.csv - counts of individuals eligible for reinfection (total), who have 0 suspected reinfections (no_reinf) or >0 suspected reinfections (reinf) by province (province), age group (5-year bands, agegrp5), and sex (M = Male, F = Female, U = Unknown, sex)

Derived data files created by certain scripts will also be placed in this subdirectory.
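
As a quick check of the time series file described above, the following sketch loads ts_data.csv with base R and verifies that the documented columns are present. `load_ts()` is a hypothetical helper and not the repository's prep_ts_data.R; it assumes the date column is stored in ISO format (YYYY-MM-DD).

```r
# Hypothetical helper (not part of the repository).
load_ts <- function(path = "data/ts_data.csv") {
  ts <- read.csv(path, stringsAsFactors = FALSE)
  # Column names as documented in this README
  expected <- c("date", "cnt", "reinf", "third", "fourth")
  missing_cols <- setdiff(expected, names(ts))
  if (length(missing_cols) > 0) {
    stop("Missing expected columns: ", paste(missing_cols, collapse = ", "))
  }
  ts$date <- as.Date(ts$date)  # assumption: ISO dates (YYYY-MM-DD)
  ts[order(ts$date), ]
}

# Runs only when called from the repository's main directory.
if (file.exists("data/ts_data.csv")) {
  ts <- load_ts()
  print(range(ts$date))
  print(colSums(ts[, c("cnt", "reinf", "third", "fourth")], na.rm = TRUE))
}
```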

Code files

The files listed below are located in the code subdirectory:

Data preparation scripts

  • prep_ts_data.R - creates an RDS file with time series data (used in analysis / plotting scripts)
  • prep_demog_data.R - creates an RData file with counts by province and counts by age group / sex combination (used in analysis / plotting scripts)

Files generated by these scripts will be placed in the data subdirectory and will be ignored by the version control system.
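
The kind of aggregation that prep_demog_data.R is described as producing (counts by province, and counts by age group / sex combination) can be sketched with base R as follows. This is an illustration only, not the repository's script; `summarise_demog()` is a hypothetical helper, and the column names are those documented for demog_data.csv above.

```r
# Hypothetical helper (not part of the repository): sums the count columns
# within each combination of the grouping columns named in `by`.
summarise_demog <- function(demog, by) {
  counts <- c("total", "no_reinf", "reinf")
  stopifnot(all(c(counts, by) %in% names(demog)))
  aggregate(demog[counts], by = demog[by], FUN = sum)
}

# Runs only when called from the repository's main directory.
if (file.exists("data/demog_data.csv")) {
  demog <- read.csv("data/demog_data.csv", stringsAsFactors = FALSE)
  by_province <- summarise_demog(demog, "province")
  by_age_sex <- summarise_demog(demog, c("agegrp5", "sex"))
  print(by_province)
}
```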

Utility scripts

  • install.R - checks for required packages and installs if not present
  • empirical_hazard_fxn.R - utility functions for empirical hazard estimation (approach 2)
  • fit_fxn_null.R - utility functions for likelihood calculations (approach 1)
  • plotting_fxns.R - utility functions for formatting plots
  • wave_defs.R - utility functions for defining wave periods

Utility functions generated by these scripts will be placed in the utils subdirectory and will be ignored by the version control system.

Analysis and visualization scripts

Descriptive:

  • ts_plot.R - creates time series plot (Figure 1)
  • demog_plot.R - creates descriptive analysis plot (Figures S1 and S2, panel A)

Approach 1:

  • mcmc_fit.R - implements MCMC fitting for approach 1
  • sim_null.R - simulates projections from the null model for approach 1 using a simplified simulation approach
  • sim_null_dyn.R - simulates projections from the null model for approach 1 using an approach that includes dynamical noise (Note: Not used; see file for additional information.)
  • sim_plot.R - creates plot of observed data with model fits and projections using approach 1 (Figure 3)
  • convergence_plot.R - creates plot of convergence diagnostics using output of the MCMC fitting procedure (Figure S4)

Approach 2:

  • emp_haz_plot.R - creates empirical hazards plot using approach 2 (Figure 5)
  • reconstruct_data_for_reg.R - creates reconstructed data set using model for approach 2, to be used in regression analysis
  • reg_out.R - conducts Poisson regression analysis and outputs estimates of coefficients with 95% confidence intervals
  • sens_an.R - conducts sensitivity analysis to assumptions about observation probabilities for approach 2
  • sens_an_plot.R - creates sensitivity analysis plot (Figure S8)
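
For readers unfamiliar with how coefficient estimates and 95% confidence intervals are typically extracted from a fitted Poisson regression in R, here is a minimal sketch. It uses Wald intervals (estimate ± z × standard error), which is an assumption; reg_out.R may compute its intervals differently. `coef_table()` is a hypothetical helper, and the variable names in the commented usage are placeholders rather than names from the repository.

```r
# Hypothetical helper (not part of the repository): Wald confidence intervals
# on the log scale for the coefficients of a fitted regression model.
coef_table <- function(fit, level = 0.95) {
  est <- coef(summary(fit))              # matrix with Estimate and Std. Error
  z <- qnorm(1 - (1 - level) / 2)        # about 1.96 for a 95% interval
  data.frame(
    term = rownames(est),
    estimate = est[, "Estimate"],
    lower = est[, "Estimate"] - z * est[, "Std. Error"],
    upper = est[, "Estimate"] + z * est[, "Std. Error"],
    row.names = NULL
  )
}

# Example usage with placeholder names (events, wave, time_at_risk, dat):
# fit <- glm(events ~ wave + offset(log(time_at_risk)),
#            family = poisson, data = dat)
# coef_table(fit)
```

Exponentiating the estimate and interval bounds gives rate ratios relative to the reference level.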

Output files

Files listed below are not included in the repository but can be downloaded and placed in the output subdirectory:

Files generated by the code in this repository will also generally be placed in this subdirectory (with exceptions as noted above).

Releases

Code releases

Data releases

The most up-to-date data (only) are available on Zenodo at DOI: 10.5281/zenodo.5745338.

Other information

The manuscript was prepared using the following configuration:

  • R version: 4.0.5 (2021-03-31)
  • Platform: x86_64-apple-darwin17.0 (64-bit)
  • Running under: macOS Catalina 10.15.7
