ukhsa-collaboration / excess-deaths

Description

This repository contains the analysis that produces the weekly excess deaths estimates published by Public Health England.

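This README explains how to use the repository. It covers the data requirements, guidance on how to produce the reports, the structure of the repository, and how to recreate package versions using renv.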

Data requirements

These reports require:

  • record-level data for deaths, including information on age, sex, resident geography, ethnicity, place of death and cause of death. These data cover January 2015 to December 2019, and then 20 March 2020 onwards
  • population estimates for the same time periods, broken down by the same demographic characteristics
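
As a rough illustration of these requirements, the sketch below checks a deaths extract for the expected fields and flags whether dates fall inside the two covered periods. The column names and both helper functions are hypothetical and are not taken from the repository; the real field names in the deaths data may differ.

```r
# Hypothetical column names, chosen to mirror the fields listed above.
required_cols <- c("age", "sex", "geography", "ethnicity",
                   "place_of_death", "cause_of_death")

# Hypothetical helper: report which required columns are absent from a data frame.
missing_columns <- function(data, required = required_cols) {
  setdiff(required, names(data))
}

# Hypothetical helper: TRUE where a date falls in January 2015 to December 2019,
# or on/after 20 March 2020 (the two periods described above).
in_covered_period <- function(dates) {
  dates <- as.Date(dates)
  historic <- dates >= as.Date("2015-01-01") & dates <= as.Date("2019-12-31")
  recent <- dates >= as.Date("2020-03-20")
  historic | recent
}

# Made-up example record with only some of the required fields.
example_deaths <- data.frame(age = 70, sex = "F", geography = "region_a")
missing_found <- missing_columns(example_deaths)
covered <- in_covered_period(c("2018-06-01", "2020-02-01", "2020-04-01"))
```

Here `missing_found` lists the three fields absent from the example record, and `covered` is `TRUE`, `FALSE`, `TRUE` because 1 February 2020 falls between the two periods.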

The methodology document describes the data used in more detail.

Guidance on how to produce the reports

The reports are built in two stages. Stage 1 creates and saves the model objects that are based on historic data. These objects are then used in Stage 2 for predicting expected deaths.
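
The two-stage pattern can be sketched as below. This is a minimal illustration only: the simulated data, the quasi-Poisson model form, the variable names and the file path are all assumptions and are not the repository's actual models or storage locations.

```r
# Stage 1 (sketch): fit a model on historic data and save the model object.
# The data are simulated and the model form is an assumption for illustration.
set.seed(1)
historic <- data.frame(
  day = 1:200,
  deaths = rpois(200, lambda = 50)
)
model <- glm(deaths ~ day, family = quasipoisson(), data = historic)

model_path <- file.path(tempdir(), "baseline_model.rds")  # hypothetical location
saveRDS(model, model_path)

# Stage 2 (sketch): reload the saved object and predict expected deaths
# for recent days without refitting the model.
saved_model <- readRDS(model_path)
recent <- data.frame(day = 201:207)
recent$expected <- as.numeric(
  predict(saved_model, newdata = recent, type = "response")
)
```

Saving the fitted object means the historic data only need to be processed once, and each later update can predict from the stored model.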

Stage 1: R/create_all_models.R

This script uses get_denominators() from the R/function_deaths_data.R file, and create_baseline() from the R/function_modelling.R file. It imports daily deaths data from the data repository and builds and saves model objects.

Stage 2: england_update.R or region_update.R

The england_update.R file in the root directory uses generate_visualisations() from the R/function_visualisations.R file. This function uses the model outputs from Stage 1 to predict daily deaths, along with prediction intervals, for recent weeks. The prediction step only runs if predictions for the current month have not been created before; in that case, the predictions are stored as csv files in a folder called predictions in a specified location in a shared area. If the predictions have already been generated, the csv files are read straight into the R session.

The function then imports deaths data from the recent deaths database and plots them alongside the predicted deaths. The output from generate_visualisations() is either a list of ggplot objects, which are passed into report/region_report.Rmd to produce the html report, or the path to a csv file containing expected, registered and COVID-19 deaths by week for a particular population subgroup.
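
The "generate once, then read from csv" behaviour described above can be sketched as follows. The function `get_or_create_predictions()`, its arguments, the file naming and the example values are hypothetical; this is not the implementation inside generate_visualisations().

```r
# Hypothetical helper: return predictions for a month, generating and caching
# them as a csv only if they have not been created before.
get_or_create_predictions <- function(predictions_dir, month_label, generate) {
  dir.create(predictions_dir, recursive = TRUE, showWarnings = FALSE)
  csv_path <- file.path(predictions_dir,
                        paste0("predictions_", month_label, ".csv"))
  if (file.exists(csv_path)) {
    # Already generated: read the stored predictions straight into the session.
    return(read.csv(csv_path))
  }
  predictions <- generate()
  write.csv(predictions, csv_path, row.names = FALSE)
  predictions
}

# Stand-in generator with made-up values; counts how often it is called.
generator_calls <- 0
fake_generate <- function() {
  generator_calls <<- generator_calls + 1
  data.frame(
    date = c("2020-03-20", "2020-03-21"),
    expected = c(100, 110),
    lower = c(90, 100),
    upper = c(110, 120)
  )
}

demo_dir <- tempfile("predictions_")  # unique temporary folder for the demo
first <- get_or_create_predictions(demo_dir, "2020-03", fake_generate)
second <- get_or_create_predictions(demo_dir, "2020-03", fake_generate)
```

The second call finds the csv file and reads it, so the generator runs only once.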

Structure of the repository

+-- R/
|   +-- create_all_models.R - script that creates the underpinning models for the national report
|   +-- create_report_data.R - a single function that creates all of the data outputs for the weekly report
|   +-- function_deaths_data.R - functions to import deaths data from the data repository in different cuts
|   +-- function_modelling.R - a function to create model objects for each subsection of the population
|   +-- function_monthly_populations.R - a function to generate monthly populations from the data repository
|   +-- function_predictions.R - function to apply models generated from function_modelling.R to make predictions for recent dates
|   +-- function_visualisations.R - function that combines predictions from function_predictions.R with actual deaths data from the daily deaths database, to produce the visualisations for reporting
|   +-- libraries.R - script that loads all the libraries used in the project
|   +-- mentions_models.R - script that creates the underpinning models for the cause of death section in the national report
|   +-- utils.R - some general utility functions
|   +-- utils_charts.R - charting functions for the report
|   +-- utils_phecharts.R - branding functions
+-- data/ - contains a place of death lookup file, and population estimates for regions by age, sex, deprivation and ethnicity
+-- tests/ - test-assertr.R; some functions included to ensure data processing within functions is working as expected
+-- renv/ - automatically generated using the renv package, no need to modify
+-- report/ 
|   +-- gov_uk.Rmd - England report
|   +-- region_report.Rmd - regional report
|   +-- supporting files for aesthetics for Rmd documents
+-- england_update.R - the script that generates the report for England
+-- region_update.R - the script that generates the regional reports

Recreating package versions using renv

Package versions for this project are managed with the renv package. If renv isn’t already installed, it will be installed when the mortality-surveillance.Rproj file is opened.

To make sure the package versions used locally are consistent with the project, run the following line once in the console:

renv::restore()

Contributors

  • Sharmani Barnard, Senior Statistical Advisor
  • Sebastian Fox, Principal Data Scientist
  • Zachary Waller, Senior Data Scientist
  • Leigh Dowd, Senior Public Health Intelligence Analyst
  • Allan Baker, Deputy Head of Population Health Analysis
  • Paul Burton, Professor of Data Science for Health at Newcastle University and Honorary Consultant in Public Health (Epidemiology and Statistics) to PHE
  • Peter Goldblatt, Senior Advisor, University College London Institute of Health Equity, and Statistical Advisor to PHE
  • Paul Fryers, Head of Public Health Data Science
  • Justine Fitzpatrick, Head of Population Health Analysis

About

License: MIT License

