5 Multiple regression
In multiple regression there is one variable to be forecast and several predictor variables. Throughout this chapter we will use two examples of multiple regression, one based on cross-sectional data and the other on time series data.
Example: credit scores
Banks score loan customers based on a lot of personal information. A sample of 500 customers from an Australian bank provided the following information.
|Score||Savings (\$'000)||Income (\$'000 )||Time current address (Months)||Time current job (Months)|
The credit score in the left hand column is used to determine if a customer will be given a loan or not. For these data, the score is on a scale between 0 and 100. It would save a lot of time if the credit score could be predicted from the other variables listed above. Then there would be no need to collect all the other information that banks require. Even if the credit score can only be roughly forecast using these four predictors, it might provide a way of filtering out customers that are unlikely to receive a high enough score to obtain a loan.
This is an example of cross-sectional data where we want to predict the value of the credit score variable using the values of the other variables.
Example: Australian quarterly beer production
Recall the Australian quarterly beer production data shown below.
These are time series data and we want to forecast the value of future beer production. There are no other variables available for predictors. Instead, with time series data, we use the number of quarters since the start of the series as a predictor variable. We may also use the quarter of the year corresponding to each observation as a predictor variable. Then, knowing the number of quarters since the start of the series and the specific quarter of interest, we can forecast the value of beer production in that quarter.