1.2 Six must-know basics of forecasting

Not knowing the basics may eventually lead to some serious mistakes during the forecasting process. This section discusses six must-know basics of forecasting in the context of load forecasting.

1.2.1 Forecasting is a stochastic problem

Forecasting, by nature, is a stochastic problem rather than a deterministic one. There is no "certain" in forecasting. Statements like "the sun will rise tomorrow" are not forecasts. Since forecasters are dealing with randomness, the output of a forecasting process is supposed to be in a probabilistic form, such as a forecast under this or that scenario, a probability density function, a prediction interval, or some quantile of interest. In practice, many decision-making processes today cannot yet take probabilistic inputs, so the most commonly used output form is still the point forecast, e.g., the expected future value of a random variable.
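As a minimal sketch of the output forms listed above, the following Python snippet reduces a set of simulated load scenarios to a point forecast, a median, and an empirical prediction interval. The function name `summarize_scenarios`, the nearest-rank quantile rule, and the normally distributed scenarios are all assumptions made for illustration; they are not taken from any particular forecasting system.

```python
import random
import statistics

def summarize_scenarios(scenarios, lower_q=0.05, upper_q=0.95):
    """Summarize simulated scenarios as a point forecast, a median, and an interval."""
    ordered = sorted(scenarios)
    n = len(ordered)

    def empirical_quantile(q):
        # Nearest-rank style quantile on the sorted scenarios (an illustrative choice).
        idx = min(n - 1, max(0, int(round(q * (n - 1)))))
        return ordered[idx]

    return {
        "point": statistics.fmean(ordered),  # expected value as the point forecast
        "median": empirical_quantile(0.5),
        "interval": (empirical_quantile(lower_q), empirical_quantile(upper_q)),
    }

# Hypothetical scenarios: 5000 draws around 1000 MW with a 50 MW standard deviation.
rng = random.Random(0)
scenarios = [rng.gauss(1000.0, 50.0) for _ in range(5000)]
summary = summarize_scenarios(scenarios)
print(summary)
```

The point forecast is what a decision process limited to single numbers would consume, while the interval keeps some of the uncertainty that the point forecast discards.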

1.2.2 All forecasts are wrong

Due to the stochastic nature of forecasting, the response variable is never 100% predictable. Take guessing heads or tails on a coin flip as an example. The guess may be right a few times, but the probability that it is always right shrinks toward zero as the number of flips grows. A question like "why is your forecast different from the actual?" should never be asked, because we do expect some differences between the forecasts and the actual values. If anything, forecasts that match the actual values exactly should raise suspicion. On the other hand, it is fair to ask "why does your forecast fail to capture XYZ features from the actual?" Doing so requires the person raising the question to identify the missing features first. Beyond the inherent randomness, plenty of avoidable factors can make forecasts worse, such as bad data, inappropriate methodologies, and crappy software. It is the forecaster's job to apply best practices to avoid these issues.
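To put a number on the coin example from this subsection: if each guess is right with probability 0.5 and the flips are independent, the chance of getting n guesses in a row right is 0.5 to the power n. The short sketch below, with the hypothetical helper `prob_all_correct`, shows how quickly that probability vanishes.

```python
def prob_all_correct(n_flips, p_correct=0.5):
    """Probability of guessing n independent flips all correctly."""
    return p_correct ** n_flips

for n in (1, 10, 100):
    print(n, prob_all_correct(n))
```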

1.2.3 Some forecasts are useful

Most industries require some forecasts in their decision-making processes, but not necessarily perfect forecasts. The retail industry needs SKU (stock keeping unit) forecasts to optimize promotional offerings and to manage inventory; the airline industry needs passenger forecasts to schedule flights; the utility industry needs load forecasts to operate and plan the power systems.

There are at least two aspects of usefulness in the utility industry: accuracy and defensibility. Accuracy can be calculated based on various peaks (e.g., monthly, seasonal or annual peaks), energy, or a combination of them. Defensibility may include interpretability, traceability, and reproducibility.

While the points above may not be mutually exclusive, they should be prioritized differently depending upon the exact business need. For example, for regulatory compliance purposes, we would emphasize defensibility more than accuracy. As a result, statistical approaches such as multiple linear regression are usually preferred over black-box approaches like artificial neural networks.
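The accuracy aspect described in this subsection can be measured on peaks, on energy, or on both. The sketch below assumes a short series of hourly loads in MW and compares the forecast peak and the forecast energy (the sum of hourly loads) with the actual ones. The function names and the numbers are hypothetical, and the peak comparison looks only at magnitude, not at the timing of the peak.

```python
def pct_error(forecast_value, actual_value):
    """Signed percentage error of a forecast relative to the actual value."""
    return (forecast_value - actual_value) / actual_value * 100.0

def peak_and_energy_errors(actual_load, forecast_load):
    """Percentage error on the peak (maximum) and on the energy (sum) of a load series."""
    return {
        "peak_error_pct": pct_error(max(forecast_load), max(actual_load)),
        "energy_error_pct": pct_error(sum(forecast_load), sum(actual_load)),
    }

# Hypothetical hourly loads in MW.
actual_load = [60.0, 80.0, 100.0, 70.0]
forecast_load = [65.0, 85.0, 90.0, 70.0]

errors = peak_and_energy_errors(actual_load, forecast_load)
print(errors)
```

In this made-up case the energy is forecast exactly while the peak is under-forecast by 10%, which is why the choice of accuracy measure should follow the business need.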

1.2.4 All forecasts can be improved

Since all forecasts are wrong, there is always room for improvement, at least from the accuracy aspect. Broadly speaking, the objective of forecast improvement is to enhance usefulness. Because there are many aspects of usefulness, it can be hard to figure out what to improve. Other than the ones discussed above, such as various error metrics, interpretability, traceability and reproducibility, there are some more specific directions for potential improvement:

1) Spread of errors. Nobody likes to have surprisingly big errors. Reducing the variance or range of the errors means reducing the uncertainty, which consequently increases the usefulness of the forecasts. Sometimes the business may even accept a slightly worse central tendency of the error (e.g., a higher mean absolute percentage error, MAPE) to gain a reduction in the spread (e.g., a lower standard deviation of the absolute percentage error, APE).

2) Interpretability of errors. For instance, in long-term load forecasting, due to the uncertainty in long-term weather and economy forecasts, the load forecasts may present some significant errors from time to time. The forecasters should then help the business users understand how much of the error is attributable to modeling error, weather forecast error, and economy forecast error. Breaking down the error into its sources increases the interpretability as well as the usefulness of the forecasts.

3) Requirement of resources. In reality, we always have limited resources to build a forecast. The limitations may be from data, hardware and labor. If we can enhance the simplicity of the forecasting process by reducing the requirements on these resources, it can be very useful for the business side.
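The first two directions in the list above can be illustrated with a small sketch. `error_profile` reports the central tendency (MAPE) next to the spread (standard deviation and maximum of APE). `decompose_error` splits a total error into a modeling part and a weather-forecast part by comparing a forecast made with actual weather (`ex_post`) against one made with forecast weather (`ex_ante`). This two-way split is one possible approach assumed here for illustration, and all names and numbers are hypothetical.

```python
import statistics

def ape(actuals, forecasts):
    """Absolute percentage errors in percent (actual values must be nonzero)."""
    return [abs(a - f) / abs(a) * 100.0 for a, f in zip(actuals, forecasts)]

def error_profile(actuals, forecasts):
    """Central tendency (MAPE) and spread (std. dev. and max of APE)."""
    errors = ape(actuals, forecasts)
    return {
        "mape": statistics.fmean(errors),
        "ape_std": statistics.pstdev(errors),
        "ape_max": max(errors),
    }

def decompose_error(actual, ex_post, ex_ante):
    """Split the total error into modeling and weather-forecast contributions.

    ex_post: forecast produced with actual weather (assumed available after the fact)
    ex_ante: forecast produced with forecast weather
    """
    return {
        "modeling": actual - ex_post,
        "weather": ex_post - ex_ante,
        "total": actual - ex_ante,
    }

# Hypothetical data: model B has a higher MAPE than model A but no large surprises.
actuals = [100.0, 100.0, 100.0, 100.0]
model_a = [98.0, 102.0, 100.0, 88.0]
model_b = [95.0, 105.0, 95.0, 105.0]

profile_a = error_profile(actuals, model_a)
profile_b = error_profile(actuals, model_b)
parts = decompose_error(actual=1000.0, ex_post=980.0, ex_ante=950.0)
print(profile_a, profile_b, parts)
```

Here model A has the lower MAPE (4% versus 5%) but a worst-case APE of 12%, while model B never misses by more than 5%. A business that dislikes surprises might prefer model B despite its higher MAPE.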

1.2.5 Accuracy is never guaranteed

One way to guarantee a MAPE of 100% is to turn in a forecast with zeros at all points, but such a forecast has no value to the business. Due to the stochastic nature of forecasting, the future will never repeat history in exactly the way described by our models. Sometimes the deviations are large; sometimes they are small. Even if a forecaster has maintained a stable accuracy over the past few years, there is still no guarantee that the same or similar accuracy can be achieved going forward. Sometimes consultants and vendors promise unrealistic accuracy to clients in order to sell services or solutions. This is one of the worst practices, because eventually the clients will realize that the error is not as low as what was promised.
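The remark about the all-zero forecast follows directly from the definition of MAPE: with a forecast of zero, every absolute percentage error is |actual - 0| / |actual| = 100%. A minimal check, using made-up actual values and a hypothetical `mape` helper:

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error in percent (actual values must be nonzero)."""
    ratios = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)]
    return sum(ratios) / len(ratios) * 100.0

# Hypothetical actual loads in MW.
actuals = [820.0, 905.0, 1110.0]
zero_forecast = [0.0] * len(actuals)

zero_mape = mape(actuals, zero_forecast)
print(zero_mape)  # 100.0
```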

1.2.6 Having a second opinion is preferred

"A man with a watch knows what time it is. A man with two watches is never sure."

A forecaster may not prefer wearing two watches, but would love to have two forecasting models. There is no perfect model. If only one model is being used, the forecaster will experience "bad" forecasts from time to time. If multiple models are available, the situation can be completely different. The forecaster can have good confidence when the models agree with each other, and can focus on the periods when the models disagree significantly. Empirically, combining forecasting techniques usually does a better job than any individual technique by offering more robust and accurate forecasts. Therefore, one of the best practices is to run multiple models and combine the forecasts.
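A minimal sketch of this practice, assuming the simplest combination scheme (an equal-weight average) and a hypothetical 5% threshold for what counts as significant disagreement between models:

```python
def combine_forecasts(model_forecasts):
    """Equal-weight average across models for each period."""
    n_models = len(model_forecasts)
    return [sum(values) / n_models for values in zip(*model_forecasts)]

def flag_disagreement(model_forecasts, threshold_pct=5.0):
    """Indices of periods where the range across models exceeds threshold_pct
    of the combined forecast (the threshold is an illustrative assumption)."""
    combined = combine_forecasts(model_forecasts)
    flagged = []
    for i, values in enumerate(zip(*model_forecasts)):
        spread_pct = (max(values) - min(values)) / abs(combined[i]) * 100.0
        if spread_pct > threshold_pct:
            flagged.append(i)
    return flagged

# Hypothetical forecasts from two models for three periods.
model_a = [100.0, 110.0, 120.0]
model_b = [102.0, 108.0, 140.0]

combined = combine_forecasts([model_a, model_b])
flagged = flag_disagreement([model_a, model_b])
print(combined, flagged)
```

The two models agree closely in the first two periods, so only the third period is flagged for the forecaster's attention. Other combination schemes, such as weighted averages, could be substituted for the simple average used here.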