8.8 Forecasting

Point forecasts

Although we have calculated forecasts from the ARIMA models in our examples, we have not yet explained how they are obtained. Point forecasts can be calculated using the following three steps.

1. Expand the ARIMA equation so that $y_t$ is on the left hand side and all other terms are on the right.
2. Rewrite the equation by replacing $t$ by $T+h$.
3. On the right hand side of the equation, replace future observations by their forecasts, future errors by zero, and past errors by the corresponding residuals.

Beginning with $h=1$, these steps are then repeated for $h=2,3,\dots$ until all forecasts have been calculated.

The procedure is most easily understood via an example. We will illustrate it using the ARIMA(3,1,1) model fitted in the previous section. The model can be written as follows:

$$(1-\phi_1B -\phi_2B^2-\phi_3B^3)(1-B) y_t = (1+\theta_1B)e_{t},$$

where $\phi_1=0.0519$, $\phi_2=0.1191$, $\phi_3=0.3730$ and $\theta_1=-0.4542$.

Then we expand the left hand side to obtain

$$\left[1-(1+\phi_1)B +(\phi_1-\phi_2)B^2 + (\phi_2-\phi_3)B^3 +\phi_3B^4\right] y_t = (1+\theta_1B)e_{t},$$

and applying the backshift operator gives

$$y_t - (1+\phi_1)y_{t-1} +(\phi_1-\phi_2)y_{t-2} + (\phi_2-\phi_3)y_{t-3} +\phi_3y_{t-4} = e_t+\theta_1e_{t-1}.$$

Finally, we move all terms other than $y_t$ to the right hand side:

$$\tag{8.2}\label{arima301f} y_t = (1+\phi_1)y_{t-1} -(\phi_1-\phi_2)y_{t-2} - (\phi_2-\phi_3)y_{t-3} -\phi_3y_{t-4} + e_t+\theta_1e_{t-1}.\qquad$$

This completes the first step. While the equation now looks like an ARIMA(4,0,1), it is still the same ARIMA(3,1,1) model we started with. It cannot be considered an ARIMA(4,0,1) because the coefficients do not satisfy the stationarity conditions.
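The expansion, and the reason the result is not a stationary ARIMA(4,0,1), can be checked numerically. The following is a minimal sketch: the helper `poly_multiply` is a name introduced here for illustration, and polynomials in $B$ are stored as coefficient lists with the constant term first. If the expanded polynomial evaluates to zero at $B=1$, it has a unit root, so the stationarity conditions fail.

```python
def poly_multiply(a, b):
    """Multiply two polynomials in B, given as coefficient lists (constant term first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out


# Estimated AR coefficients quoted in the text.
phi = [0.0519, 0.1191, 0.3730]

ar_poly = [1.0] + [-p for p in phi]   # 1 - phi1*B - phi2*B^2 - phi3*B^3
diff_poly = [1.0, -1.0]               # 1 - B

# Coefficients of 1, B, B^2, B^3, B^4 on the left hand side.
expanded = poly_multiply(ar_poly, diff_poly)

# Evaluating a polynomial at B = 1 is just the sum of its coefficients.
value_at_one = sum(expanded)
```

Here `expanded` holds $1$, $-(1+\phi_1)$, $\phi_1-\phi_2$, $\phi_2-\phi_3$ and $\phi_3$, and `value_at_one` is zero up to floating-point rounding, which confirms the unit root contributed by the factor $(1-B)$.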

For the second step, we replace $t$ by $T+1$ in (\ref{arima301f}):

$$y_{T+1} = (1+\phi_1)y_{T} -(\phi_1-\phi_2)y_{T-1} - (\phi_2-\phi_3)y_{T-2} -\phi_3y_{T-3} + e_{T+1}+\theta_1e_{T}.$$

Assuming we have observations up to time $T$, all values on the right hand side are known except for $e_{T+1}$, which we replace by zero, and $e_T$, which we replace by the last observed residual $\hat{e}_T$:

$$\hat{y}_{T+1|T} = (1+\phi_1)y_{T} -(\phi_1-\phi_2)y_{T-1} - (\phi_2-\phi_3)y_{T-2} -\phi_3y_{T-3} + \theta_1\hat{e}_{T}.$$

A forecast of $y_{T+2}$ is obtained by replacing $t$ by $T+2$ in (\ref{arima301f}). All values on the right hand side will be known at time $T$ except $y_{T+1}$, which we replace by $\hat{y}_{T+1|T}$, and $e_{T+2}$ and $e_{T+1}$, both of which we replace by zero:

$$\hat{y}_{T+2|T} = (1+\phi_1)\hat{y}_{T+1|T} -(\phi_1-\phi_2)y_{T} - (\phi_2-\phi_3)y_{T-1} -\phi_3y_{T-2}.$$

The process continues in this manner for all future time periods. In this way, any number of point forecasts can be obtained.
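The recursion can be sketched in a few lines of Python. This is an illustrative sketch only, not the routine used by any particular forecasting package: the function name `forecast_arima311`, its arguments, and the example series and residual are all assumptions made here. It applies equation (8.2) step by step, using the last residual only for $h=1$ and feeding each forecast back in as if it were an observation.

```python
def forecast_arima311(y, last_residual, phi, theta1, horizon):
    """Recursive point forecasts from equation (8.2) for an ARIMA(3,1,1) model.

    y             -- observed series y_1, ..., y_T (at least four values)
    last_residual -- the residual for time T from the fitted model
    phi           -- tuple (phi1, phi2, phi3)
    theta1        -- MA(1) coefficient
    horizon       -- number of steps ahead to forecast
    """
    if len(y) < 4:
        raise ValueError("need at least four observations")
    phi1, phi2, phi3 = phi
    # Coefficients on y_{t-1}, y_{t-2}, y_{t-3}, y_{t-4} in equation (8.2).
    coefs = [1 + phi1, -(phi1 - phi2), -(phi2 - phi3), -phi3]

    history = list(y)
    forecasts = []
    for h in range(1, horizon + 1):
        lags = history[-1:-5:-1]  # four most recent values, newest first
        value = sum(c * v for c, v in zip(coefs, lags))
        if h == 1:
            # e_T is replaced by its residual; all future errors are zero.
            value += theta1 * last_residual
        forecasts.append(value)
        history.append(value)  # future observations are replaced by forecasts
    return forecasts


# Hypothetical data and residual, with the coefficients quoted in the text.
example = forecast_arima311(
    y=[10.0, 12.0, 11.0, 13.0, 12.5],
    last_residual=0.4,
    phi=(0.0519, 0.1191, 0.3730),
    theta1=-0.4542,
    horizon=3,
)
```

A quick sanity check on the design: setting all coefficients to zero reduces the model to a random walk, and every forecast then equals the last observation.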

Forecast intervals

The calculation of ARIMA forecast intervals is much more difficult, and the details are largely beyond the scope of this book. We will just give some simple examples.

The first forecast interval is easily calculated. If $\hat{\sigma}$ is the standard deviation of the residuals, then a 95% forecast interval is given by $\hat{y}_{T+1|T} \pm 1.96\hat{\sigma}$. This result is true for all ARIMA models regardless of their parameters and orders.

Multi-step forecast intervals for ARIMA($0,0,q$) models are relatively easy to calculate. We can write the model as

$$y_t = e_t + \sum_{i=1}^q \theta_i e_{t-i}.$$

Then the estimated forecast variance can be written as

$$v_{T+h|T} = \hat{\sigma}^2 \left[ 1 + \sum_{i=1}^{h-1} \theta_i^2\right], \qquad\text{for } h=2,3,\dots,$$

where $\theta_i=0$ for $i>q$, and a 95% forecast interval is given by $\hat{y}_{T+h|T} \pm 1.96\sqrt{v_{T+h|T}}$.
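The variance formula translates directly into code. The sketch below is illustrative: the function name `ma_forecast_interval` and its arguments are assumptions introduced here, and the point forecast, $\hat{\sigma}$ and MA coefficients are taken as given. Terms $\theta_i$ with $i>q$ are treated as zero, so the variance stops growing once $h>q$.

```python
import math


def ma_forecast_interval(point_forecast, sigma, theta, h, z=1.96):
    """Forecast interval for an MA(q) model at horizon h.

    point_forecast -- the point forecast for time T+h
    sigma          -- standard deviation of the residuals
    theta          -- list [theta_1, ..., theta_q] of MA coefficients
    h              -- forecast horizon (1, 2, 3, ...)
    z              -- normal quantile; 1.96 gives a 95% interval
    Returns (lower, upper, variance).
    """
    if h < 1:
        raise ValueError("h must be at least 1")
    # Slicing keeps theta_1, ..., theta_{h-1}; coefficients beyond q are
    # absent from the list, which is equivalent to treating them as zero.
    used = theta[: h - 1]
    variance = sigma ** 2 * (1 + sum(t * t for t in used))
    half_width = z * math.sqrt(variance)
    return point_forecast - half_width, point_forecast + half_width, variance
```

For $h=1$ the sum is empty and the variance reduces to $\hat{\sigma}^2$, which agrees with the one-step interval given earlier.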

In Section 8/4, we showed that an AR(1) model can be written as an MA($\infty$) model. Using this equivalence, the above result for MA($q$) models can also be used to obtain forecast intervals for AR(1) models.
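As a worked case, assume an AR(1) model $y_t = \phi_1 y_{t-1} + e_t$ with $|\phi_1|<1$. Its MA($\infty$) form has coefficients $\theta_i = \phi_1^i$, so substituting into the forecast variance formula gives a geometric sum:

$$v_{T+h|T} = \hat{\sigma}^2 \left[ 1 + \sum_{i=1}^{h-1} \phi_1^{2i}\right] = \hat{\sigma}^2\, \frac{1-\phi_1^{2h}}{1-\phi_1^{2}}.$$

As $h$ increases, this variance approaches $\hat{\sigma}^2/(1-\phi_1^2)$.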

More general results, and other special cases of multi-step forecast intervals for an ARIMA($p,d,q$) model, are given in more advanced textbooks such as Brockwell and Davis (1991, Section 9.5).

The forecast intervals for ARIMA models are based on the assumptions that the residuals are uncorrelated and normally distributed. If either of these assumptions does not hold, then the forecast intervals may be incorrect. For this reason, always plot the ACF and histogram of the residuals to check the assumptions before producing forecast intervals.

In general, forecast intervals from ARIMA models will widen as the forecast horizon increases. For stationary models (i.e., with $d=0$), they will converge, so forecast intervals for long horizons are all essentially the same. For $d\ge1$, the forecast intervals will continue to grow into the future.
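The contrast can be illustrated with two simple cases. The sketch below assumes the standard result that the $h$-step forecast variance is $\hat{\sigma}^2\sum_{i=0}^{h-1}\psi_i^2$, where $\psi_i$ are the coefficients of the model written in MA($\infty$) form. For a stationary AR(1) model, $\psi_i=\phi_1^i$; for a random walk, ARIMA(0,1,0), $y_{T+h}=y_T+e_{T+1}+\dots+e_{T+h}$, so $\psi_i=1$. The function name and the parameter values ($\phi_1=0.5$, $\hat{\sigma}=1$) are hypothetical.

```python
import math


def interval_half_widths(psi_weight, sigma, horizons, z=1.96):
    """Half-widths z*sqrt(v_h), with v_h = sigma^2 * sum_{i=0}^{h-1} psi_i^2.

    psi_weight -- function returning psi_i for i = 0, 1, 2, ...
    """
    widths = []
    for h in horizons:
        variance = sigma ** 2 * sum(psi_weight(i) ** 2 for i in range(h))
        widths.append(z * math.sqrt(variance))
    return widths


horizons = [1, 5, 20, 100]
phi1 = 0.5  # hypothetical AR(1) coefficient

# Stationary AR(1): half-widths level off as h grows.
stationary = interval_half_widths(lambda i: phi1 ** i, 1.0, horizons)

# Random walk (d = 1): half-widths grow like sqrt(h) without bound.
random_walk = interval_half_widths(lambda i: 1.0, 1.0, horizons)

# Limiting half-width for the AR(1) case.
ar1_limit = 1.96 * 1.0 / math.sqrt(1 - phi1 ** 2)
```

The values in `stationary` approach `ar1_limit`, while those in `random_walk` keep increasing with the horizon.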

As with most forecast interval calculations, ARIMA-based intervals tend to be too narrow. This occurs because only the variation in the errors has been accounted for. There is also variation in the parameter estimates, and in the model order, that has not been included in the calculation. In addition, the calculation assumes that the historical patterns that have been modelled will continue into the forecast period.