8 Forecast evaluation
Where possible, the accuracy evaluation should be handled by existing tidymodels tools such as yardstick. It is likely that some changes or extensions will be needed for full support of time series accuracy metrics.
The forecast package implements accuracy() as a function that is applied to a model. Out-of-sample accuracy can be computed by additionally providing a test set.
It is probably more transparent to compute accuracy metrics by directly providing actual response values and model predictions.
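As a minimal sketch of this approach, the base R functions below compute a few point forecast measures directly from actual response values and predictions. The function names and the example numbers are illustrative assumptions, not part of yardstick or forecast.

```r
# Hypothetical helpers (base R only): each takes actual values and predictions.
mae  <- function(actual, predicted) mean(abs(actual - predicted))
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
mape <- function(actual, predicted) mean(abs((actual - predicted) / actual)) * 100

# Made-up example data.
actual    <- c(100, 110, 120, 130)
predicted <- c(102, 108, 123, 126)

c(
  MAE  = mae(actual, predicted),
  RMSE = rmse(actual, predicted),
  MAPE = mape(actual, predicted)
)
```

Nothing here depends on the model object, which is what makes the calculation easy to inspect.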
8.2 Model vs data centric
forecast is model centric:
# forecast
accuracy(f = forecast, x = new_ts)
yardstick is data centric (see https://github.com/r-lib/generics/pull/22):
# yardstick
fit_tbl %>% accuracy(col1, col2)
8.3.1 Desirable functionality
accuracy() should provide a basic set of measures of fit for both models (mdl_df) and forecasts (fbl_ts), similarly to the forecast package (perhaps only MAE, RMSE/MSE, and MAPE by default).
It should be sufficiently flexible to support analysts in calculating a wide variety of accuracy measures, including:
- Point forecast accuracy measures
- Interval accuracy measures
- Distribution accuracy measures
- User specified accuracy measures
The user should be able to specify which measures they wish to compute, including measures exported by fablelite, measures from extension packages, and user specified measures.
8.3.2 Proposed user interface
The accuracy measures to be calculated can be specified as a list of accuracy measure functions via the measures argument. This input will also be flattened, allowing groups of accuracy measures to be defined.
... is used to provide additional arguments that will be applied to all accuracy measures (where supported).
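The flattening step could look something like the sketch below, which lets a user bundle measures into a named group and pass that group alongside individual measures. The measure definitions, the `point_measures` group, and the `flatten_measures()` helper are all hypothetical and only illustrate the idea.

```r
# Toy measure functions (hypothetical stand-ins for exported measures).
ME   <- function(.resid, ...) mean(.resid)
MAE  <- function(.resid, ...) mean(abs(.resid))
RMSE <- function(.resid, ...) sqrt(mean(.resid^2))

# A user-defined group of measures.
point_measures <- list(ME = ME, MAE = MAE)

# Hypothetical helper: recursively flatten nested lists of functions
# into a single flat list, keeping the names given to each function.
flatten_measures <- function(x) {
  if (is.function(x)) return(list(x))
  out <- list()
  for (i in seq_along(x)) {
    nm <- names(x)[i]
    item <- flatten_measures(x[[i]])
    # Only a directly named function takes its name from this level.
    if (is.function(x[[i]]) && !is.null(nm) && nzchar(nm)) names(item) <- nm
    out <- c(out, item)
  }
  out
}

measures <- flatten_measures(list(point_measures, RMSE = RMSE))
names(measures)
```

With this shape, `measures = list(point_measures, RMSE = RMSE)` and `measures = list(ME = ME, MAE = MAE, RMSE = RMSE)` would be treated identically.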
For models (mdl_df), no additional inputs are required:
mbl %>% accuracy(measures = list(MASE, MAE, ME), ...)
For forecasts (fbl_ts), the test set must be provided. Additionally, the dataset used for model training can be provided (interface still under consideration) to extend the inputs (required for MASE):
fbl %>% accuracy(new_data, measures = list(MASE, MAE, ME), training_data = NULL, ...)
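To illustrate why MASE needs the training data, here is a minimal base R sketch. MASE scales the forecast errors by the in-sample mean absolute error of the (seasonal) naive method, which can only be computed from the training series. The `mase()` function, its `.train` argument name, and the numbers are assumptions made for this example, not the proposed interface.

```r
# Hypothetical MASE sketch: scale the errors by the in-sample MAE of the
# (seasonal) naive method, computed on the training data.
mase <- function(.resid, .train, .period = 1) {
  scale <- mean(abs(diff(.train, lag = .period)))
  mean(abs(.resid)) / scale
}

# Made-up training data, test data, and forecasts.
train <- c(10, 12, 11, 13, 12, 14)
test  <- c(13, 15)
fc    <- c(14, 14)

mase(.resid = test - fc, .train = train)
```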
8.3.3 Implementation details
To achieve this, accuracy measure functions can expect a set of basic inputs from accuracy(). The inputs that a measure requires for its computation should be declared as formals of the measure function. These inputs include (the list is not yet comprehensive and will be extended):
- .resid: A vector of residuals from either the training (model accuracy) or test (forecast accuracy) data.
- .resp: A vector of responses matching the residuals (for forecast accuracy, the original data must be provided).
- .fitted: The fitted values from the model, or forecasted values from the forecast.
- .dist: The distribution of fitted values from the model, or forecasted values from the forecast.
- .period: The seasonal period of the data (defaulting to the ‘smallest’ seasonal period).
- .expr_resp: An expression for the response variable.
If a measure accepts more inputs than these, such as demeaning for MASE, the additional arguments are passed through the dots (...) of the accuracy() function.
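One way the formals-based dispatch described above might work is sketched below in base R. The `compute_measures()` helper and the toy `MAE` and `MAPE` definitions are hypothetical; the sketch only assumes that each measure names the inputs it needs in its formals and that extra arguments reach measures that can accept them.

```r
# Toy measures: each declares only the inputs it needs as formals.
MAE  <- function(.resid, ...) mean(abs(.resid))
MAPE <- function(.resid, .resp, ...) mean(abs(.resid / .resp)) * 100

# Hypothetical dispatcher: give each measure the basic inputs named in its
# formals, plus any extra arguments it is able to accept.
compute_measures <- function(measures, inputs, ...) {
  extra <- list(...)
  vapply(measures, function(fn) {
    wanted <- names(formals(fn))
    args <- inputs[intersect(names(inputs), wanted)]
    if ("..." %in% wanted) {
      args <- c(args, extra)
    } else {
      args <- c(args, extra[intersect(names(extra), wanted)])
    }
    do.call(fn, args)
  }, numeric(1))
}

# Made-up inputs, as accuracy() might assemble them (.resid = .resp - .fitted).
inputs <- list(
  .resid  = c(-2, 2, -3, 4),
  .resp   = c(100, 110, 120, 130),
  .fitted = c(102, 108, 123, 126),
  .period = 1
)

result <- compute_measures(list(MAE = MAE, MAPE = MAPE), inputs)
result
```

A measure that does not list `.fitted` or `.period` never receives them, so new inputs can be added later without breaking existing measures.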
8.4 Cross validation
CV(tsbl, mdl, h, window_type, ...)
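The signature above leaves the splitting scheme open. As a rough sketch only, the base R helper below generates rolling-origin train/test index sets for a series of length `n`. The `cv_splits()` name, the `initial` argument, and the "expanding" and "sliding" values for `window_type` are assumptions for illustration and are not taken from the design.

```r
# Hypothetical rolling-origin splitter for a series of length n.
# h: forecast horizon; initial: size of the first training window.
cv_splits <- function(n, h, initial, window_type = c("expanding", "sliding")) {
  window_type <- match.arg(window_type)
  stopifnot(initial >= 1, h >= 1, initial + h <= n)
  origins <- seq(initial, n - h)
  lapply(origins, function(end) {
    # A sliding window keeps the training size fixed at `initial`;
    # an expanding window always starts from the first observation.
    start <- if (window_type == "sliding") end - initial + 1 else 1
    list(train = seq(start, end), test = seq(end + 1, end + h))
  })
}

splits <- cv_splits(n = 10, h = 2, initial = 6)
length(splits)
splits[[1]]
```

Each element could then be used to fit the model on the training rows and score the forecasts on the test rows with the accuracy measures described earlier.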