
Estimates the peak time of an epidemic curve from bootstrapped samples of the available data.

Usage

estimate_peak(x, n = 100L, alpha = 0.05, first_only = TRUE, progress = TRUE)

Arguments

x

An incidence2 object.

n

integer.

The number of bootstrap datasets to be generated; defaults to 100.

[double] vectors will be converted via as.integer(n).

alpha

numeric.

The type I error rate chosen for the confidence interval; defaults to 0.05, which gives a 95% interval.

first_only

bool.

Should only the first peak (by date) be kept?

Defaults to TRUE.

progress

bool.

Should a progress bar be displayed?

Defaults to TRUE.

Value

A data frame with the following columns:

  • observed_peak: the date of peak incidence of the original dataset.

  • observed_count: the peak incidence of the original dataset.

  • median: the median peak time of the bootstrap datasets.

  • lower_ci/upper_ci: the confidence interval based on bootstrap datasets.

  • bootstrap_peaks: a nested tibble containing the peak times of the bootstrapped datasets.

Details

Input dates are resampled with replacement to form bootstrapped datasets; the peak is reported for each, resulting in a distribution of peak times. When there are ties for peak incidence, only the first date is reported.
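The resampling idea can be sketched in base R. This is an illustration only and does not reproduce the internals of estimate_peak(): the toy dates, the helper first_peak(), and the use of percentile quantiles for the interval are all assumptions made for the example.

```r
# Hypothetical toy data: one onset date per case (not from a real outbreak).
set.seed(1)
onset <- as.Date("2024-01-01") + c(0, 1, 1, 2, 2, 2, 3, 3, 5)

# Hypothetical helper: return the first date with the highest count.
# which.max() returns the first maximum, so ties resolve to the earliest date.
first_peak <- function(dates) {
  counts <- table(dates)
  as.Date(names(counts)[which.max(counts)])
}

observed <- first_peak(onset)

# Resample case dates with replacement and record the peak of each replicate.
n <- 100L
alpha <- 0.05
peaks <- vapply(
  seq_len(n),
  function(k) as.numeric(first_peak(sample(onset, replace = TRUE))),
  numeric(1)
)

# Assumed summary: percentile interval and median of the peak distribution.
# type = 1 keeps the quantiles on dates that actually occurred.
q <- quantile(peaks, c(alpha / 2, 0.5, 1 - alpha / 2), type = 1, names = FALSE)
summary_dates <- as.Date(q, origin = "1970-01-01")
```

Here summary_dates holds the lower bound, median, and upper bound, in that order.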

Note that the bootstrapping approach used for estimating the peak time makes the following assumptions:

  • the total number of events is known (no uncertainty on total incidence)

  • dates with no events (zero incidence) will never be in bootstrapped datasets

  • the reporting is assumed to be constant over time, i.e. every case is equally likely to be reported
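The first two assumptions follow directly from resampling observed cases, as a small base R check may help show. The dates below are hypothetical, and sample() stands in for whatever resampling the package performs.

```r
set.seed(42)

# Hypothetical case dates with a gap: no cases on 2024-01-04 or 2024-01-05.
onset <- as.Date("2024-01-01") + c(0, 1, 1, 2, 2, 2, 5, 5)

# One bootstrap replicate: draw as many cases as were observed, with replacement.
# Each case is drawn with equal probability, mirroring the constant-reporting assumption.
replicate_dates <- sample(onset, replace = TRUE)

# The total number of events equals the observed total in every replicate.
same_total <- length(replicate_dates) == length(onset)

# A date with zero observed incidence can never be drawn.
gap <- as.Date("2024-01-04")
gap_drawn <- gap %in% replicate_dates
```

In this sketch same_total is always TRUE and gap_drawn is always FALSE, whatever the seed, because only observed case dates can be sampled.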

See also

bootstrap_incidence() for the bootstrapping underlying this approach and keep_peaks() to get the peaks in a single incidence2 object.

Author

Thibaut Jombart and Tim Taylor, with inputs on caveats from Michael Höhle.

Examples

if (requireNamespace("outbreaks", quietly = TRUE)) {

  # load data and create incidence
  data(fluH7N9_china_2013, package = "outbreaks")
  i <- incidence(fluH7N9_china_2013, date_index = "date_of_onset")

  # find 95% CI for peak time using bootstrap
  estimate_peak(i)
}
#> # A tibble: 1 × 7
#>   count_variable observed_peak observed_count bootstrap_peaks lower_ci  
#>   <chr>          <date>                 <int> <list>          <date>    
#> 1 date_of_onset  2013-04-03                 7 <df [100 × 1]>  2013-03-29
#> # ℹ 2 more variables: median <date>, upper_ci <date>
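A variation on the example above, using more bootstrap datasets and a 90% interval. The arguments come from the Usage section; the column names are taken from the printed output above. That set.seed() makes the result reproducible is an assumption about how the resampling draws random numbers.

```r
if (requireNamespace("incidence2", quietly = TRUE) &&
    requireNamespace("outbreaks", quietly = TRUE)) {

  library(incidence2)

  # load data and create incidence
  data(fluH7N9_china_2013, package = "outbreaks")
  i <- incidence(fluH7N9_china_2013, date_index = "date_of_onset")

  # assumed to fix the bootstrap draws for reproducibility
  set.seed(1)

  # 500 bootstrap datasets, 90% interval, no progress bar
  peak <- estimate_peak(i, n = 500L, alpha = 0.1, progress = FALSE)

  # show the observed peak next to the bootstrap summary
  peak[, c("observed_peak", "lower_ci", "median", "upper_ci")]
}
```

A larger n gives a smoother distribution of peak times at the cost of run time, and a larger alpha gives a narrower interval.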