evaluate_resampling() uses repeated K-fold cross-validation and, by default, the Root Mean Square Error (RMSE) of testing sets to measure the predictive power of a single model. Methods are provided for trending::trending_model objects (and lists of these).
evaluate_resampling(x, ...)
# S3 method for default
evaluate_resampling(x, ...)
# S3 method for trending_model
evaluate_resampling(
  x,
  data,
  metric = c("rmse", "rsq", "mae"),
  metric_arguments = list(na.rm = TRUE),
  v = nrow(data),
  repeats = 1,
  ...
)
# S3 method for list
evaluate_resampling(
  x,
  data,
  metric = c("rmse", "rsq", "mae"),
  metric_arguments = list(na.rm = TRUE),
  v = nrow(data),
  repeats = 1,
  ...
)
x
An R object.
...
Not currently used.
data
A data.frame containing the data (including the response variable and all predictors) used in the specified model.
metric
One of "rmse" (see calculate_rmse), "mae" (see calculate_mae) and "rsq" (see calculate_rsq).
metric_arguments
A named list of arguments passed to the underlying functions that calculate the metrics.
v
The number of equally sized data partitions to be used for K-fold cross-validation; v cross-validations will be performed, each using v - 1 partitions as the training set and the remaining partition as the testing set. Defaults to the number of rows in data, so that the method uses leave-one-out cross-validation, akin to the jackknife except that the testing set (and not the training set) is used to compute the fit statistics.
repeats
The number of times the random K-fold cross-validation should be repeated; defaults to 1. Larger values are likely to yield more reliable and stable results, at the expense of computation time.
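The role of v can be illustrated with a small base-R sketch. This is not the package's implementation: kfold_rmse() is a hypothetical helper written only for this illustration, and it hard-codes a Poisson GLM of y on x. Each row is randomly assigned to one of v partitions, the model is fitted on v - 1 of them, and the RMSE is computed on the partition that was held out.

```r
# Minimal base-R sketch of K-fold cross-validation scored by RMSE.
# kfold_rmse() is a hypothetical helper, not part of any package.
kfold_rmse <- function(data, v = nrow(data)) {
  # Randomly assign each row to one of v partitions of (near-)equal size
  fold <- sample(rep(seq_len(v), length.out = nrow(data)))
  vapply(seq_len(v), function(k) {
    train <- data[fold != k, , drop = FALSE]  # v - 1 partitions
    test  <- data[fold == k, , drop = FALSE]  # the remaining partition
    fit   <- glm(y ~ x, family = poisson, data = train)
    pred  <- predict(fit, newdata = test, type = "response")
    sqrt(mean((test$y - pred)^2))             # RMSE on the testing set
  }, numeric(1))
}

set.seed(1)
x <- rnorm(100, mean = 0)
y <- rpois(n = 100, lambda = exp(x + 1))
dat <- data.frame(x = x, y = y)

loo  <- kfold_rmse(dat)         # default v = nrow(dat): leave-one-out
five <- kfold_rmse(dat, v = 5)  # 5-fold: one score per partition
```

With the default, every testing set holds a single row, so each score reduces to the absolute prediction error for that row. A smaller v gives fewer scores, each computed over a larger testing set.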
These functions wrap existing functions from several packages. evaluate_resampling.trending_model() and evaluate_resampling.list() both use rsample::vfold_cv() for sampling and the yardstick package for calculating the different metrics.
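To give a feel for how these building blocks fit together, the following sketch calls rsample and yardstick directly. It is an illustration under stated assumptions, not the code the methods actually run: the model is a plain stats::glm() rather than a trending_model, and the choice of 5 partitions and 2 repeats is arbitrary.

```r
library(rsample)
library(yardstick)

set.seed(1)
dat <- data.frame(x = rnorm(100))
dat$y <- rpois(100, lambda = exp(dat$x + 1))

# 5 partitions repeated twice gives 10 train/test splits
folds <- vfold_cv(dat, v = 5, repeats = 2)

scores <- vapply(folds$splits, function(split) {
  train <- analysis(split)    # rows used for fitting
  test  <- assessment(split)  # held-out rows used for scoring
  fit   <- glm(y ~ x, family = poisson, data = train)
  pred  <- predict(fit, newdata = test, type = "response")
  rmse_vec(truth = test$y, estimate = as.numeric(pred), na_rm = TRUE)
}, numeric(1))

mean(scores)  # average RMSE over all splits
```

Note that yardstick spells its missing-value argument na_rm, whereas the metric_arguments default shown in the usage is na.rm.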
x <- rnorm(100, mean = 0)
y <- rpois(n = 100, lambda = exp(x + 1))
dat <- data.frame(x = x, y = y)
model <- trending::glm_model(y ~ x, poisson)
models <- list(
  poisson_model = trending::glm_model(y ~ x, poisson),
  linear_model = trending::lm_model(y ~ x)
)
evaluate_resampling(model, dat)
#> # A tibble: 100 × 9
#> metric result warnings errors model fitting_wa…¹ fitti…² predi…³ predi…⁴
#> * <chr> <dbl> <list> <list> <list> <list> <list> <list> <list>
#> 1 rmse 0.399 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL> <NULL>
#> 2 rmse 1.04 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL> <NULL>
#> 3 rmse 0.733 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL> <NULL>
#> 4 rmse 3.57 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL> <NULL>
#> 5 rmse 0.221 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL> <NULL>
#> 6 rmse 4.48 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL> <NULL>
#> 7 rmse 0.365 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL> <NULL>
#> 8 rmse 1.38 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL> <NULL>
#> 9 rmse 0.802 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL> <NULL>
#> 10 rmse 3.14 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL> <NULL>
#> # … with 90 more rows, and abbreviated variable names
#> # ¹fitting_warnings$warnings, ²fitting_errors$errors,
#> # ³predicting_warnings$warnings, ⁴predicting_errors$errors
evaluate_resampling(models, dat)
#> # A tibble: 200 × 10
#> model_name metric result warni…¹ errors model fitti…² fitti…³ predi…⁴
#> <chr> <chr> <dbl> <list> <list> <list> <list> <list> <list>
#> 1 poisson_model rmse 0.839 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL>
#> 2 poisson_model rmse 4.48 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL>
#> 3 poisson_model rmse 2.29 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL>
#> 4 poisson_model rmse 0.501 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL>
#> 5 poisson_model rmse 1.01 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL>
#> 6 poisson_model rmse 1.06 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL>
#> 7 poisson_model rmse 3.14 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL>
#> 8 poisson_model rmse 0.940 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL>
#> 9 poisson_model rmse 0.388 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL>
#> 10 poisson_model rmse 1.26 <NULL> <NULL> <glm_trn_> <NULL> <NULL> <NULL>
#> # … with 190 more rows, 1 more variable: predicting_errors <tibble[,1]>, and
#> # abbreviated variable names ¹warnings, ²fitting_warnings$warnings,
#> # ³fitting_errors$errors, ⁴predicting_warnings$warnings
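The per-split rows returned for a list of models can be condensed into one score per model. The sketch below assumes that evaluate_resampling() is provided by the trendeval package and that the result has the model_name and result columns shown in the printed output; the values chosen for metric, v and repeats are arbitrary.

```r
library(trendeval)  # assumption: the package that provides evaluate_resampling()

set.seed(1)
x <- rnorm(100, mean = 0)
y <- rpois(n = 100, lambda = exp(x + 1))
dat <- data.frame(x = x, y = y)

models <- list(
  poisson_model = trending::glm_model(y ~ x, poisson),
  linear_model = trending::lm_model(y ~ x)
)

# 5-fold cross-validation repeated 10 times, scored by mean absolute error
res <- evaluate_resampling(models, dat, metric = "mae", v = 5, repeats = 10)

# Average the per-split scores for each model (lower is better for MAE)
summary_by_model <- tapply(res$result, res$model_name, mean)
summary_by_model
```

Using fewer partitions than rows means each testing set contains several observations, which makes the individual scores less noisy than under leave-one-out, while the repeats average out the randomness of the partitioning.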