Bootstrapping and finding peaks
We provide functions to find the peak of incidence data (grouped or ungrouped), to bootstrap from the incidence data, and to estimate confidence intervals around the peak.
bootstrap()
dat <- fluH7N9_china_2013
x <- incidence(dat, date_index = "date_of_onset", groups = "gender")
bootstrap(x)
#> # incidence: 67 x 4
#> # count vars: date_of_onset
#> # groups: gender
#> date_index gender count_variable count
#> * <date> <fct> <chr> <int>
#> 1 2013-02-19 m date_of_onset 1
#> 2 2013-02-27 m date_of_onset 0
#> 3 2013-03-07 m date_of_onset 1
#> 4 2013-03-08 m date_of_onset 0
#> 5 2013-03-09 f date_of_onset 2
#> 6 2013-03-13 f date_of_onset 1
#> 7 2013-03-17 m date_of_onset 0
#> 8 2013-03-19 f date_of_onset 3
#> 9 2013-03-20 f date_of_onset 2
#> 10 2013-03-20 m date_of_onset 1
#> # … with 57 more rows
find_peak()
dat <- fluH7N9_china_2013
x <- incidence(dat, date_index = "date_of_onset", groups = "gender")
# peak for each group
find_peak(x)
#> # incidence: 2 x 4
#> # count vars: date_of_onset
#> # groups: gender
#> date_index gender count_variable count
#> * <date> <fct> <chr> <int>
#> 1 2013-04-11 f date_of_onset 3
#> 2 2013-04-03 m date_of_onset 6
# peak without groupings
find_peak(regroup(x))
#> # incidence: 1 x 3
#> # count vars: date_of_onset
#> date_index count_variable count
#> * <date> <chr> <int>
#> 1 2013-04-03 date_of_onset 7
estimate_peak()
Note that the bootstrapping approach used for estimating the peak time makes the following assumptions:
- the total number of events is known (no uncertainty on total incidence);
- dates with no events (zero incidence) will never be in bootstrapped datasets; and
- the reporting is assumed to be constant over time, i.e. every case is equally likely to be reported.
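To make these assumptions concrete, here is a minimal base-R sketch of a case-resampling bootstrap for the peak date. It is an illustration only and is not taken from the package source: the `onset` dates are made up, `peak_date()` is a hypothetical helper, and the 200 replicates and 95% percentile interval are arbitrary choices.

```r
# Illustrative sketch only; not the package's implementation.
set.seed(1)

# Hypothetical onset dates, one entry per case.
onset <- as.Date("2013-04-01") + c(0, 1, 1, 2, 2, 2, 3, 3, 5, 8)

# Hypothetical helper: the first date with the highest case count.
peak_date <- function(dates) {
  counts <- table(dates)
  as.Date(names(counts)[which.max(counts)])
}

# Resample cases with replacement. The total number of cases stays fixed,
# dates with no cases can never be drawn, and every case is equally likely.
boot_num <- replicate(
  200,
  as.numeric(peak_date(sample(onset, replace = TRUE)))
)

observed <- peak_date(onset)

# Percentile interval on the day numbers, converted back to dates.
ci_num <- quantile(boot_num, probs = c(0.025, 0.5, 0.975),
                   type = 1, names = FALSE)
ci <- as.Date(ci_num, origin = "1970-01-01")
```

Because only observed cases are resampled, every bootstrapped peak falls on a date that already has at least one case, which is why zero-incidence dates never appear in the bootstrapped datasets.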
dat <- fluH7N9_china_2013
x <- incidence(dat, date_index = "date_of_onset", groups = "province")
# regroup to estimate the overall peak (progress bar suppressed for markdown output)
estimate_peak(regroup(x), progress = FALSE)
#> # A data frame: 1 × 7
#> count_variable observed_peak observ…¹ boots…² lower_ci median upper_ci
#> <chr> <date> <int> <list> <date> <date> <date>
#> 1 date_of_onset 2013-04-03 7 <df> 2013-03-29 2013-04-06 2013-04-14
#> # … with abbreviated variable names ¹observed_count, ²bootstrap_peaks
# across provinces
estimate_peak(x, progress = FALSE)
#> # A data frame: 13 × 8
#> province count…¹ observed…² obser…³ boots…⁴ lower_ci median upper_ci
#> <fct> <chr> <date> <int> <list> <date> <date> <date>
#> 1 Anhui date_o… 2013-03-09 1 <df> 2013-03-09 2013-03-28 2013-04-14
#> 2 Beijing date_o… 2013-04-11 1 <df> 2013-02-19 2013-04-11 2013-05-21
#> 3 Fujian date_o… 2013-04-17 1 <df> 2013-04-17 2013-04-18 2013-04-29
#> 4 Guangdong date_o… 2013-07-27 1 <df> 2013-02-19 2013-07-27 2013-07-27
#> 5 Hebei date_o… 2013-07-10 1 <df> 2013-02-19 2013-07-10 2013-07-10
#> 6 Henan date_o… 2013-04-06 1 <df> 2013-02-19 2013-04-06 2013-04-17
#> 7 Hunan date_o… 2013-04-14 1 <df> 2013-02-19 2013-04-14 2013-04-23
#> 8 Jiangsu date_o… 2013-03-19 2 <df> 2013-03-08 2013-03-20 2013-04-19
#> 9 Jiangxi date_o… 2013-04-15 1 <df> 2013-04-15 2013-04-19 2013-05-03
#> 10 Shandong date_o… 2013-04-16 1 <df> 2013-02-19 2013-04-16 2013-04-27
#> 11 Shanghai date_o… 2013-04-01 4 <df> 2013-02-27 2013-04-01 2013-04-04
#> 12 Taiwan date_o… 2013-04-12 1 <df> 2013-02-19 2013-04-12 2013-04-12
#> 13 Zhejiang date_o… 2013-04-06 5 <df> 2013-03-29 2013-04-10 2013-04-15
#> # … with abbreviated variable names ¹count_variable, ²observed_peak,
#> # ³observed_count, ⁴bootstrap_peaks