5.6 Non-linear regression

Sometimes the relationship between the forecast variable and a predictor is not linear, and then the usual multiple regression equation needs modifying. In Section 4/7, we discussed using log transformations to deal with a variety of non-linear forms, and in Section 5/2 we showed how to include a piecewise-linear trend in a model; that is a nonlinear trend constructed out of linear pieces. Allowing other variables to enter in a nonlinear manner can be handled in exactly the same way.

To keep things simple, suppose we have only one predictor $x$. Then the model we use is $$y = f(x) + e,$$ where $f$ is a possibly nonlinear function of $x$. In standard (linear) regression, $f(x)=\beta_{0} + \beta_{1} x$, but in nonlinear regression, we allow $f$ to be a nonlinear function of $x$.

One of the simplest ways to do nonlinear regression is to make $f$ piecewise linear. That is, we introduce points where the slope of $f$ can change. These points are called “knots”.

Example 5.1 Car emissions continued

In Chapter 4, we considered an example of forecasting the carbon footprint of a car from its city-based fuel economy. Our previous analysis (Section 4/7) showed that this relationship was nonlinear. Close inspection of Figure 4.3 suggests that a change in slope occurs at about 25 mpg. Such a change in slope can be modelled using two predictors: $x$ (the City mpg) and

$$z = (x-25)_{+} = \begin{cases} 0 & \text{if } x < 25 \\ x-25 & \text{if } x\ge 25 \end{cases}$$
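
To see why these two variables produce a bend at 25, it can help to write the regression function out on each side of the knot. The coefficient labels $\beta_0$, $\beta_1$ and $\beta_2$ below are assumed for illustration, with $\beta_2$ denoting the coefficient on $z$:

$$f(x) = \beta_0 + \beta_1 x + \beta_2 (x-25)_{+} = \begin{cases} \beta_0 + \beta_1 x & \text{if } x < 25 \\ (\beta_0 - 25\beta_2) + (\beta_1 + \beta_2) x & \text{if } x\ge 25 \end{cases}$$

The slope is therefore $\beta_1$ below the knot and $\beta_1+\beta_2$ above it, and both pieces take the value $\beta_0 + 25\beta_1$ at $x=25$, so the line is continuous there.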

The resulting fitted values are shown as the red line below.

Figure 5.12: Piecewise linear trend to fuel economy data.

R code
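library(fpp)  # assumed source of the fuel data; also loads the forecast package
# Hinge variable (City - 25)_+ for the knot at 25 mpg
Cityp <- pmax(fuel$City-25,0)
fit2 <- lm(Carbon ~ City + Cityp, data=fuel)
# Grid of City values, with matching hinge values, for drawing the fitted line
x <- 15:50; z <- pmax(x-25,0)
fcast2 <- forecast(fit2, newdata=data.frame(City=x,Cityp=z))
plot(jitter(Carbon) ~ jitter(City), data=fuel)
lines(x, fcast2$mean,col="red")
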
Additional bends can be included in the relationship by adding further variables of the form $(x-c)_+$ where $c$ is the “knot” or point at which the line should bend. As above, the notation $(x-c)_+$ means the value $x-c$ if it is positive and 0 otherwise.
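
As a minimal sketch of how several knots might be handled, the following base R code uses synthetic, noise-free data (not the fuel data) with bends at 25 and 35. The helper `hinge`, the second knot at 35 and all coefficient values are assumptions chosen for illustration only.

```r
# Hypothetical helper: the term (x - knot)_+ for a single knot.
hinge <- function(x, knot) pmax(x - knot, 0)

# Synthetic, noise-free data (not the fuel data).
# True slopes: -0.2 below 25, -0.05 between 25 and 35, -0.02 above 35.
x <- 15:50
y <- 10 - 0.2 * x + 0.15 * hinge(x, 25) + 0.03 * hinge(x, 35)
dat <- data.frame(x = x, y = y, x25 = hinge(x, 25), x35 = hinge(x, 35))

# One extra predictor per knot; each coefficient is the change in slope there.
fit <- lm(y ~ x + x25 + x35, data = dat)

# The slope on each segment is the running sum of the slope coefficients.
segment_slopes <- cumsum(coef(fit)[c("x", "x25", "x35")])
print(round(segment_slopes, 3))
```

Because each hinge coefficient measures a change in slope rather than the slope itself, the cumulative sum is what recovers the slope of each segment.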

Regression splines

Piecewise linear relationships constructed in this way are a special case of regression splines. In general, a linear regression spline is obtained using

$$x_{1}= x \quad x_{2} = (x-c_{1})_+ \quad\dots\quad x_{k} = (x-c_{k-1})_+,$$

where $c_{1},\dots,c_{k-1}$ are the knots (the points at which the line can bend). Selecting the number of knots ($k-1$) and where they should be positioned can be difficult and somewhat arbitrary. Automatic knot selection algorithms are available in some software, but are not yet widely used.

A smoother result is obtained using piecewise cubics rather than piecewise lines. These are constrained so they are continuous (they join up) and they are smooth (so there are no sudden changes of direction as we see with piecewise linear splines). In general, a cubic regression spline is written as

$$x_{1}= x \quad x_{2}=x^2 \quad x_3=x^3 \quad x_4 = (x-c_{1})^3_+ \quad\dots\quad x_{k} = (x-c_{k-3})^3_+.$$
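
The predictors in this formula can be assembled directly in base R. The sketch below is illustrative only: the helper name `cubic_spline_basis` is hypothetical, and it assumes at least one knot is supplied.

```r
# Hypothetical helper: columns x, x^2, x^3 plus one truncated cubic
# (x - knot)^3_+ for each knot. Assumes at least one knot.
cubic_spline_basis <- function(x, knots) {
  # outer() returns a length(x) by length(knots) matrix of truncated cubics
  truncated <- outer(x, knots, function(a, k) pmax(a - k, 0)^3)
  basis <- cbind(x, x^2, x^3, truncated)
  colnames(basis) <- paste0("x", seq_len(ncol(basis)))
  basis
}

x <- 15:50
B <- cubic_spline_basis(x, knots = 25)
print(dim(B))               # number of rows and columns in the basis
print(B[x %in% c(24, 25, 27), ])
```

The truncated cubic column is zero up to and including the knot and grows as $(x-25)^3$ beyond it, which is what lets the curve change shape at the knot without a visible kink. A matrix built this way could then be supplied as the predictors in a call to `lm()`.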

An example of a cubic regression spline fitted to the fuel economy data is shown below with a single knot at $c_{1}=25$.

Figure 5.13: Cubic regression spline fitted to the fuel economy data.

R code
# Reuses Cityp, x and z from the piecewise linear example above
fit3 <- lm(Carbon ~ City + I(City^2) + I(City^3) + I(Cityp^3), data=fuel)
fcast3 <- forecast(fit3,newdata=data.frame(City=x,Cityp=z))
plot(jitter(Carbon) ~ jitter(City), data=fuel)
lines(x, fcast3$mean,col="red")

A cubic regression spline usually gives a better fit to the data, although forecasts of Carbon for values of City outside the range of the historical data become very unreliable.