tenglongli / EpiNow2

Estimate Realtime Case Counts and Time-varying Epidemiological Parameters

Home Page: https://epiforecasts.io/EpiNow2/


EpiNow2: Estimate realtime case counts and time-varying epidemiological parameters


This package is under development.

This package estimates the time-varying reproduction number, rate of spread, and doubling time using a range of open-source tools and current best practices. It aims to help users avoid some of the limitations of naive implementations in a framework that is informed by community feedback and is under active development.

The time-varying reproduction number is estimated on cases by date of infection (using an approach similar to that implemented in {EpiEstim}). Imputed infections are then mapped to observed data (for example, cases by date of report) via a series of uncertain delay distributions (in the example below, an incubation period and a reporting delay) and a reporting model that can include weekly periodicity.

The default model uses a non-stationary Gaussian process to estimate the time-varying reproduction number. Optionally, a stationary Gaussian process may be used (faster to estimate but with reduced performance for real-time estimates), and arbitrary breakpoints can be defined. Uncertainty from all inputs is propagated internally into the final parameter estimates, which helps to mitigate spurious findings. Time-varying estimates of the rate of growth are derived from the time-varying reproduction number estimates and the uncertain generation time.

Optionally, the time-varying reproduction number can be forecast forwards in time using an integration with the {EpiSoon} package and converted to a case forecast using the renewal equation. Alternatively, the time-varying reproduction number and cases can be forecast using a Gaussian process. As a standalone tool, non-parametric back-calculation is also supported, using a novel formulation based on a smoothed mean delay shift of reported cases combined with a Gaussian process to determine the most likely outbreak trajectory.
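
To make the renewal equation mentioned above more concrete, the following base R sketch projects infections forward from a given reproduction number and a discretised generation time distribution. The function name renewal_forecast, the generation time weights and the seed values are illustrative assumptions; this is a deterministic toy version and not the package's Stan implementation.

```r
# Illustrative only: a deterministic renewal equation,
#   I_t = R_t * sum_s w_s * I_{t - s},
# where w is a discretised generation time distribution that sums to 1.
renewal_forecast <- function(seed_infections, rt, gen_time_pmf) {
  stopifnot(length(seed_infections) >= 1,
            abs(sum(gen_time_pmf) - 1) < 1e-8)
  infections <- seed_infections
  n_seed <- length(seed_infections)
  for (i in seq_along(rt)) {
    t <- n_seed + i
    # Lags run from 1 to the shorter of the pmf length and the available history
    lags <- seq_len(min(length(gen_time_pmf), t - 1))
    infections[t] <- rt[i] * sum(gen_time_pmf[lags] * infections[t - lags])
  }
  # Return only the projected days, dropping the seeding period
  infections[-seq_len(n_seed)]
}

# Hypothetical inputs: 4 days of seeding, R fixed at 1.2 for 7 days
w <- c(0.2, 0.5, 0.3)
forecast <- renewal_forecast(c(10, 12, 15, 18), rt = rep(1.2, 7),
                             gen_time_pmf = w)
```

With the reproduction number held at 1 and constant seed infections, this sketch returns a flat trajectory, which is a quick sanity check on the weights.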

Installation

Install the stable version of the package using {drat}:

install.packages("drat")
drat:::add("epiforecasts")
install.packages("EpiNow2")

Install the development version of the package with:

remotes::install_github("epiforecasts/EpiNow2")

Windows users will need a working installation of Rtools in order to build the package from source. See here for a guide to installing Rtools for use with Stan (the statistical modeling platform used for the underlying model). For simple deployment or development, a prebuilt Docker image is also available (see documentation here).

Quick start

{EpiNow2} is designed to be used either with a single function call or in an ad hoc fashion via individual function calls. The following section gives an overview of the simple use case. For more on using each function, see the function documentation. The core functions are epinow, regional_epinow, estimate_infections, and forecast_infections. estimate_infections can be used on its own to infer the underlying infection case curve from reported cases, with Rt optionally returned (on by default). Estimating the underlying infection case curve alone is substantially less computationally demanding than also estimating Rt.

Reporting delays, incubation period and generation time

Distributions can either be fitted using package functionality or determined elsewhere and then defined with uncertainty for use in {EpiNow2}. When data are supplied, a subsampled bootstrapped lognormal is fitted (to account for uncertainty in the observed data without being biased by changes in incidence). An arbitrary number of delay distributions is supported, with the most common use case likely to be an incubation period followed by a reporting delay.
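
As a rough illustration of what a subsampled bootstrapped lognormal fit involves, the base R sketch below repeatedly resamples the observed delays, estimates the log-scale mean and standard deviation of each subsample, and summarises the spread of those estimates. The function name bootstrap_lognormal and its arguments are hypothetical, and the moment-based fit is a stand-in: it is an assumption for illustration, not the fitting procedure the package uses.

```r
# Illustrative only: moment-based lognormal fits on bootstrap subsamples.
# This is not the package's fitting routine.
bootstrap_lognormal <- function(values, bootstraps = 100, subsample = 50) {
  fits <- vapply(seq_len(bootstraps), function(i) {
    draw <- sample(values, size = subsample, replace = TRUE)
    c(meanlog = mean(log(draw)), sdlog = sd(log(draw)))
  }, c(meanlog = 0, sdlog = 0))
  # Point estimates plus their bootstrap standard deviations,
  # mirroring the mean / mean_sd / sd / sd_sd layout used in this README
  list(
    mean = mean(fits["meanlog", ]), mean_sd = sd(fits["meanlog", ]),
    sd = mean(fits["sdlog", ]), sd_sd = sd(fits["sdlog", ])
  )
}

set.seed(1)
sketch_delay <- bootstrap_lognormal(rlnorm(100, log(6), 1))
```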

reporting_delay <- EpiNow2::bootstrapped_dist_fit(rlnorm(100, log(6), 1))
## Set max allowed delay to 30 days to truncate computation
reporting_delay$max <- 30

reporting_delay
#> $mean
#> [1] 1.698169
#> 
#> $mean_sd
#> [1] 0.1546306
#> 
#> $sd
#> [1] 1.114934
#> 
#> $sd_sd
#> [1] 0.1117431
#> 
#> $max
#> [1] 30
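
The fitted mean and sd above are close to the log(6) (about 1.79) and 1 used to simulate the data, which suggests they are the log-scale parameters of the lognormal rather than a delay in days. Assuming that reading is correct, the sketch below converts them to the natural scale using the standard lognormal moment formulas. The helper name lognormal_natural_scale is hypothetical and not part of {EpiNow2}.

```r
# Assumption: `meanlog` and `sdlog` are the log-scale lognormal parameters.
lognormal_natural_scale <- function(meanlog, sdlog) {
  list(
    # E[X] = exp(mu + sigma^2 / 2)
    mean = exp(meanlog + sdlog^2 / 2),
    # SD[X] = sqrt((exp(sigma^2) - 1) * exp(2 * mu + sigma^2))
    sd = sqrt((exp(sdlog^2) - 1) * exp(2 * meanlog + sdlog^2))
  )
}

# Values copied from the printed output above
natural <- lognormal_natural_scale(meanlog = 1.698169, sdlog = 1.114934)
```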

Here we define the incubation period and generation time based on literature estimates for Covid-19 (see here for the code that generates these estimates).

generation_time <- list(mean = EpiNow2::covid_generation_times[1, ]$mean,
                        mean_sd = EpiNow2::covid_generation_times[1, ]$mean_sd,
                        sd = EpiNow2::covid_generation_times[1, ]$sd,
                        sd_sd = EpiNow2::covid_generation_times[1, ]$sd_sd,
                        max = 30)

incubation_period <- list(mean = EpiNow2::covid_incubation_period[1, ]$mean,
                          mean_sd = EpiNow2::covid_incubation_period[1, ]$mean_sd,
                          sd = EpiNow2::covid_incubation_period[1, ]$sd,
                          sd_sd = EpiNow2::covid_incubation_period[1, ]$sd_sd,
                          max = 30)
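
To illustrate how a delay distribution with a max value maps infections to later observations, the base R sketch below discretises a lognormal delay into daily probabilities, truncates it at the maximum delay, and convolves it with an infection curve. The function names, the daily discretisation scheme and the parameter values (meanlog = 1.6, sdlog = 0.4) are assumptions made for illustration and may differ from what the package does internally.

```r
# Assumption: daily discretisation of a lognormal delay, truncated at
# `max_delay` days and renormalised so the probabilities sum to 1.
discretise_lognormal <- function(meanlog, sdlog, max_delay) {
  days <- 0:max_delay
  pmf <- plnorm(days + 1, meanlog, sdlog) - plnorm(days, meanlog, sdlog)
  pmf / sum(pmf)
}

# Expected observations on day t: sum over delays d of
# infections[t - d] * pmf[d + 1], using only the days that exist in the input.
convolve_delay <- function(infections, pmf) {
  vapply(seq_along(infections), function(t) {
    d <- 0:min(t - 1, length(pmf) - 1)
    sum(infections[t - d] * pmf[d + 1])
  }, numeric(1))
}

# Hypothetical delay parameters and infection curve
delay_pmf <- discretise_lognormal(meanlog = 1.6, sdlog = 0.4, max_delay = 30)
expected_reports <- convolve_delay(c(10, 20, 40, 80, 160), delay_pmf)
```

Several delays (for example an incubation period followed by a reporting delay) can be chained by applying convolve_delay once per distribution.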

The epinow function represents the core functionality of the package and includes results reporting, plotting and optional saving. It requires a data frame of cases by date of report and the distributions defined above. An additional forecasting module is supported via {EpiSoon} and companion packages (see documentation for an example).

Load example case data from {EpiNow2}.

reported_cases <- EpiNow2::example_confirmed[1:50]

head(reported_cases)
#>          date confirm
#> 1: 2020-02-22      14
#> 2: 2020-02-23      62
#> 3: 2020-02-24      53
#> 4: 2020-02-25      97
#> 5: 2020-02-26      93
#> 6: 2020-02-27      78

Estimate cases by date of infection, the time-varying reproduction number, the rate of growth and forecast these estimates into the future by 7 days. Summarise the posterior and return a summary table and plots for reporting purposes. If a target_folder is supplied results can be internally saved (with the option to also turn off explicit returning of results). Note that for real use cases more samples and a longer warm up may be needed.

estimates <- EpiNow2::epinow(reported_cases = reported_cases, 
                             generation_time = generation_time,
                             delays = list(incubation_period, reporting_delay),
                             horizon = 7, samples = 1000, warmup = 200, 
                             cores = 4, chains = 4, verbose = TRUE, 
                             adapt_delta = 0.95)
#> [[1]]
#> Stan model 'estimate_infections' does not contain samples.
#> 
#> [[2]]
#> Stan model 'estimate_infections' does not contain samples.

names(estimates)
#> [1] "estimates"                "estimated_reported_cases"
#> [3] "summary"                  "plots"

Both summary measures and posterior samples are returned for all parameters in an easily explored format.

estimates$estimates
#> $samples
#>                 variable          parameter time       date sample    value
#>      1:       infections imputed_infections    1 2020-02-07      1    3.000
#>      2:       infections imputed_infections    2 2020-02-08      1   16.000
#>      3:       infections imputed_infections    3 2020-02-09      1   22.000
#>      4:       infections imputed_infections    4 2020-02-10      1   37.000
#>      5:       infections imputed_infections    5 2020-02-11      1   41.000
#>     ---                                                                    
#> 128068: prior_infections   prior_infections   68 2020-04-14      1 2581.666
#> 128069: prior_infections   prior_infections   69 2020-04-15      1 2517.820
#> 128070: prior_infections   prior_infections   70 2020-04-16      1 2455.553
#> 128071: prior_infections   prior_infections   71 2020-04-17      1 2394.826
#> 128072: prior_infections   prior_infections   72 2020-04-18      1 2335.601
#>         strat     type
#>      1:  <NA> estimate
#>      2:  <NA> estimate
#>      3:  <NA> estimate
#>      4:  <NA> estimate
#>      5:  <NA> estimate
#>     ---               
#> 128068:  <NA> forecast
#> 128069:  <NA> forecast
#> 128070:  <NA> forecast
#> 128071:  <NA> forecast
#> 128072:  <NA> forecast
#> 
#> $summarised
#>            date       variable strat     type      bottom         top
#>   1: 2020-02-22              R  <NA> estimate   0.7564282    1.414028
#>   2: 2020-02-23              R  <NA> estimate   0.8846257    1.448911
#>   3: 2020-02-24              R  <NA> estimate   0.9631228    1.437619
#>   4: 2020-02-25              R  <NA> estimate   1.0954422    1.468108
#>   5: 2020-02-26              R  <NA> estimate   1.2025306    1.512359
#>  ---                                                                 
#> 324: 2020-04-14 reported_cases  <NA> forecast 374.0000000 6874.000000
#> 325: 2020-04-15 reported_cases  <NA> forecast 142.0000000 5239.000000
#> 326: 2020-04-16 reported_cases  <NA> forecast 361.0000000 6498.000000
#> 327: 2020-04-17 reported_cases  <NA> forecast 458.0000000 9039.000000
#> 328: 2020-04-18 reported_cases  <NA> forecast 204.0000000 7451.000000
#>             lower       upper      median        mean           sd
#>   1:    0.9459273    1.234976    1.105647    1.113589 2.057302e-01
#>   2:    0.9905997    1.236385    1.162262    1.166212 1.754555e-01
#>   3:    1.1198978    1.322856    1.228027    1.226912 1.457745e-01
#>   4:    1.2053574    1.371572    1.295255    1.294681 1.183796e-01
#>   5:    1.3104278    1.437038    1.368334    1.367978 9.674816e-02
#>  ---                                                              
#> 324: 1075.0000000 2992.000000 2746.500000 3550.534000 3.046264e+03
#> 325: 1057.0000000 2669.000000 2226.000000 2855.966000 2.525423e+03
#> 326:  653.0000000 2748.000000 2574.500000 3628.484000 4.226875e+03
#> 327: 1045.0000000 3289.000000 2876.000000 4804.820000 8.242916e+03
#> 328:  917.0000000 2695.000000 2378.500000 4145.540000 8.458841e+03

Reported cases are returned separately in order to ease reporting of forecasts and model evaluation.

estimates$estimated_reported_cases
#> $samples
#>              date sample cases  type
#>     1: 2020-02-22      1   119 gp_rt
#>     2: 2020-02-23      1    97 gp_rt
#>     3: 2020-02-24      1    73 gp_rt
#>     4: 2020-02-25      1   331 gp_rt
#>     5: 2020-02-26      1   210 gp_rt
#>    ---                              
#> 28496: 2020-04-14    500  3076 gp_rt
#> 28497: 2020-04-15    500  1456 gp_rt
#> 28498: 2020-04-16    500  3279 gp_rt
#> 28499: 2020-04-17    500  5650 gp_rt
#> 28500: 2020-04-18    500  2047 gp_rt
#> 
#> $summarised
#>           date  type bottom   top lower upper median     mean         sd
#>  1: 2020-02-22 gp_rt     31   230    66   140  121.0  135.552   69.91704
#>  2: 2020-02-23 gp_rt     43   306    83   188  166.5  174.868   86.47373
#>  3: 2020-02-24 gp_rt     47   357   117   235  185.0  206.276  112.55849
#>  4: 2020-02-25 gp_rt     36   363   113   229  194.0  215.566  113.04736
#>  5: 2020-02-26 gp_rt     60   354   106   215  186.5  202.662   97.72685
#>  6: 2020-02-27 gp_rt     50   446   115   263  229.5  260.560  136.64177
#>  7: 2020-02-28 gp_rt     93   632   159   366  326.5  357.236  184.91553
#>  8: 2020-02-29 gp_rt     93   570   157   332  289.0  329.884  179.01133
#>  9: 2020-03-01 gp_rt     99   692   184   407  356.5  398.796  206.27769
#> 10: 2020-03-02 gp_rt    129   765   184   410  384.0  431.012  218.11775
#> 11: 2020-03-03 gp_rt    100   777   177   433  401.5  440.188  228.74989
#> 12: 2020-03-04 gp_rt     82   711   231   483  383.5  428.928  232.34748
#> 13: 2020-03-05 gp_rt    110  1073   256   638  557.5  627.182  337.31880
#> 14: 2020-03-06 gp_rt    221  1420   463   946  773.0  862.396  454.21571
#> 15: 2020-03-07 gp_rt    172  1495   383   883  748.5  852.600  458.96804
#> 16: 2020-03-08 gp_rt    208  2034   539  1208 1051.0 1170.814  633.95379
#> 17: 2020-03-09 gp_rt    320  2356   554  1357 1224.5 1371.076  748.93441
#> 18: 2020-03-10 gp_rt    389  2480   732  1567 1292.0 1393.288  695.64383
#> 19: 2020-03-11 gp_rt    320  2524   572  1484 1274.5 1452.322  800.00725
#> 20: 2020-03-12 gp_rt    616  3651  1010  2093 1828.0 2071.244 1078.45410
#> 21: 2020-03-13 gp_rt    614  5053  1234  2818 2522.5 2790.782 1596.56347
#> 22: 2020-03-14 gp_rt    527  4576  1396  2896 2467.0 2762.042 1438.36390
#> 23: 2020-03-15 gp_rt    871  5761  1820  3758 3133.5 3397.316 1670.04978
#> 24: 2020-03-16 gp_rt    858  6667  1576  3934 3475.0 3845.562 2081.31864
#> 25: 2020-03-17 gp_rt    706  6464  1596  3892 3478.0 3787.802 1993.28062
#> 26: 2020-03-18 gp_rt    778  6016  1930  3931 3289.0 3619.738 1999.37828
#> 27: 2020-03-19 gp_rt   1077  8256  2289  4937 4260.0 4733.790 2656.64362
#> 28: 2020-03-20 gp_rt   1540 10942  2777  6104 5412.0 6057.274 3536.53122
#> 29: 2020-03-21 gp_rt   1546  8833  2626  5686 4685.5 5106.524 2473.41052
#> 30: 2020-03-22 gp_rt   1587 10402  3436  7084 5589.5 6105.698 3008.77807
#> 31: 2020-03-23 gp_rt   1957 10848  2963  6316 5578.0 6159.720 3310.74146
#> 32: 2020-03-24 gp_rt   1407  9299  3098  6177 4864.5 5237.604 2629.14262
#> 33: 2020-03-25 gp_rt   1180  7753  2655  5223 4071.5 4531.636 2244.29211
#> 34: 2020-03-26 gp_rt   1459  9687  2813  6137 5298.5 5660.282 2881.43164
#> 35: 2020-03-27 gp_rt   1124 11926  3790  7785 6376.5 7062.978 3782.16643
#> 36: 2020-03-28 gp_rt   1332 10161  2750  5885 5126.0 5791.106 3224.55387
#> 37: 2020-03-29 gp_rt   1302 10634  2822  6548 5618.0 6133.976 3216.79496
#> 38: 2020-03-30 gp_rt   1824 11113  3573  6901 5301.0 6030.932 3247.57796
#> 39: 2020-03-31 gp_rt   1185  8284  2815  5750 4813.5 5122.896 2435.69280
#> 40: 2020-04-01 gp_rt   1093  7339  2114  4408 3797.5 4226.644 2152.01092
#> 41: 2020-04-02 gp_rt   1307  8866  2482  5361 4540.0 4962.260 2585.04416
#> 42: 2020-04-03 gp_rt   1681 10833  2461  6030 5394.0 5926.578 3038.09044
#> 43: 2020-04-04 gp_rt    895  8175  3056  5857 4380.0 4790.108 2378.61451
#> 44: 2020-04-05 gp_rt   1023  8775  2559  5537 4597.0 5136.340 2794.30555
#> 45: 2020-04-06 gp_rt   1587  8493  1777  4428 4268.0 4857.798 2600.59912
#> 46: 2020-04-07 gp_rt   1023  7298  1742  4217 3668.0 4046.914 2143.49553
#> 47: 2020-04-08 gp_rt    703  5439  1656  3505 3005.5 3254.126 1764.74083
#> 48: 2020-04-09 gp_rt    832  6896  1780  4215 3491.5 3939.996 2226.82248
#> 49: 2020-04-10 gp_rt    564  8140  2065  4793 4016.0 4574.440 2668.13093
#> 50: 2020-04-11 gp_rt    774  6444  1485  3709 3292.5 3714.246 2140.26301
#> 51: 2020-04-12 gp_rt   1033  8003  1624  4166 3636.5 4252.412 2907.12434
#> 52: 2020-04-13 gp_rt    314  7614  1285  3641 3252.5 3945.210 2990.48684
#> 53: 2020-04-14 gp_rt    374  6874  1075  2992 2746.5 3550.534 3046.26356
#> 54: 2020-04-15 gp_rt    142  5239  1057  2669 2226.0 2855.966 2525.42336
#> 55: 2020-04-16 gp_rt    361  6498   653  2748 2574.5 3628.484 4226.87475
#> 56: 2020-04-17 gp_rt    458  9039  1045  3289 2876.0 4804.820 8242.91616
#> 57: 2020-04-18 gp_rt    204  7451   917  2695 2378.5 4145.540 8458.84071
#>           date  type bottom   top lower upper median     mean         sd

A summary table is returned for rapidly understanding the results and for reporting purposes.

estimates$summary
#>                                  measure              estimate numeric_estimate
#> 1: New confirmed cases by infection date     2048 (19 -- 7867)     <data.table>
#> 2:        Expected change in daily cases                Unsure             0.72
#> 3:            Effective reproduction no.      0.8 (0.2 -- 1.4)     <data.table>
#> 4:                        Rate of growth -0.06 (-0.26 -- 0.11)     <data.table>
#> 5:          Doubling/halving time (days)   -10.9 (6.4 -- -2.7)     <data.table>
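
The rate of growth and the doubling/halving time in this table are linked to the reproduction number. The base R sketch below shows one standard way to make that link, assuming a gamma-distributed generation time and exponential growth at a constant rate. The function names and the generation time values (mean 3.6 days, sd 3.1 days) are illustrative assumptions, and the package's own conversion may differ.

```r
# Assumption: gamma-distributed generation time with the given mean and sd.
# For a gamma distribution, R = (1 + r * k * mean)^(1 / k) with
# k = (sd / mean)^2, which rearranges to the expression below.
growth_from_r <- function(R, gt_mean, gt_sd) {
  k <- (gt_sd / gt_mean)^2
  (R^k - 1) / (k * gt_mean)
}

# Doubling time in days; negative values are halving times.
doubling_time <- function(r) log(2) / r

# Hypothetical generation time values
r <- growth_from_r(R = 0.8, gt_mean = 3.6, gt_sd = 3.1)
doubling_time(r)
```

A reproduction number below 1 gives a negative growth rate and therefore a negative doubling time, which the summary table reports as a halving time.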

A range of plots are returned (with the single summary plot shown below).

estimates$plots$summary

The regional_epinow function runs the epinow function across multiple regions in an efficient manner.

Define cases in multiple regions delineated by the region variable.

reported_cases <- data.table::rbindlist(list(
   data.table::copy(reported_cases)[, region := "testland"],
   reported_cases[, region := "realland"]))

head(reported_cases)
#>          date confirm   region
#> 1: 2020-02-22      14 testland
#> 2: 2020-02-23      62 testland
#> 3: 2020-02-24      53 testland
#> 4: 2020-02-25      97 testland
#> 5: 2020-02-26      93 testland
#> 6: 2020-02-27      78 testland

Run the pipeline on each region in turn. The commented-out line (which requires the {future} package) can be used to run regions in parallel; in that case, cores should be set to 1 in most scenarios.

## future::plan("multisession")
estimates <- EpiNow2::regional_epinow(reported_cases = reported_cases, 
                                      generation_time = generation_time,
                                      delays = list(incubation_period, reporting_delay),
                                      horizon = 7, samples = 1000, warmup = 200,
                                      cores = 4, chains = 4, adapt_delta = 0.95,
                                      verbose = TRUE)
#> [[1]]
#> Stan model 'estimate_infections' does not contain samples.
#> 
#> [[1]]
#> Stan model 'estimate_infections' does not contain samples.

Results from each region are stored in a regional list, with across-region summary measures and plots stored in a summary list. All results can be saved internally by setting the target_folder and summary_dir arguments.

Summary measures that are returned include a table formatted for reporting (along with raw results for further processing).

estimates$summary$summarised_results$table
#>      Region New confirmed cases by infection date
#> 1: realland                     2012 (26 -- 8451)
#> 2: testland                     2190 (13 -- 9426)
#>    Expected change in daily cases Effective reproduction no.
#> 1:                         Unsure           0.8 (0.3 -- 1.5)
#> 2:                         Unsure           0.8 (0.2 -- 1.6)
#>           Rate of growth Doubling/halving time (days)
#> 1: -0.06 (-0.23 -- 0.13)            -11.3 (5.2 -- -3)
#> 2: -0.06 (-0.24 -- 0.16)          -12.3 (4.4 -- -2.9)

A range of plots are again returned (with the single summary plot shown below).

estimates$summary$summary_plot

Reporting templates

Rmarkdown templates are provided in the package (templates) for semi-automated reporting of estimates. These are currently undocumented, but an example integration can be seen here. If using these templates to report your results, please highlight our limitations, as these are key to understanding our results.

Contributing

File an issue here if you have identified an issue with the package. Please note that due to operational constraints priority will be given to users informing government policy or offering methodological insights. We welcome all contributions, in particular those that improve the approach or the robustness of the code base.

About


License: Other

Languages

C++ 52.9%, R 43.1%, Stan 3.9%, Dockerfile 0.1%