2 Tidy time series
2.1 ts
Time series data structures in R vary substantially, however most time series models make use of the ts
object structure from the stats
package. This object concisely stores the time series index using three ‘time series parameters’ (tsp
): start
, frequency
, and end
. For most time series tools (such as arima
, ets
, stl
) this structural information is sufficient, however it lacks details that are present in modern time series datasets:
- Multiple seasonality
- Irregular observations
- Exogenous information
- Many time series (that differ in length)
In many senses this structure is limited, and inconsistent with the tidy data principles:
- Non-rectangular index structure
- Wide-format for keyed data (
mts
) - Unnatural index format for importing data
- Difficulty working with tidyverse tools
2.2 tsibble
The tsibble package by Earo Wang provides a tidy data structure for time series, and is well described in her introductory vignette.
This data structure is sufficiently flexible to support the future of time series modelling tools (such as tbats
, fasster
and prophet
). Beyond the data tidying and transformation tools that the package provides, the object also includes valuable structural information (index
and key
) for time series modelling.
2.2.1 index
The index is essential for modelling as it can be used to identify the frequency and regularity of the observations. By storing a standard datetime
object within the dataset, it makes irregular time series modelling possible. It also allows a more flexible specification of seasonal frequency (see seasonal period) that is easier to specify for the end user (a very common difficulty when constructing ts
objects).
2.2.2 key
Keys are used within tsibble to uniquely identify related time series in a tidy structure. They are also useful for identifying relational structures between each time series. This is especially useful for forecast reconciliation, where a hierarchical or grouped structure is imposed on a set of forecasts to impose relational constraints (typically aggregation). Keys within tsibble can be either nested (hierarchical) or crossed (grouped), and can be directly used to reconcile forecasts. This structure also has purpose for univariate models, as it allows batch forecasting to be applied across many time series.