# 5.6 Non-linear regression

Sometimes the relationship between the forecast variable and a predictor is not linear, and then the usual multiple regression equation needs modifying. In Section 4/7, we discussed using log transformations to deal with a variety of nonlinear forms, and in Section 5/2 we showed how to include a piecewise-linear trend in a model, that is, a nonlinear trend constructed out of linear pieces. Allowing other variables to enter in a nonlinear manner can be handled in exactly the same way.

To keep things simple, suppose we have only one predictor $x$. Then the model we use is $$y = f(x) + e,$$ where $f$ is a possibly nonlinear function of $x$. In standard (linear) regression, $f(x)=\beta_{0} + \beta_{1} x$; nonlinear regression allows $f$ to take other forms.

One of the simplest ways to do nonlinear regression is to make $f$ piecewise linear. That is, we introduce points where the slope of $f$ can change. These points are called “knots”.

## Example 5.1 Car emissions continued

In Chapter 4, we considered an example of forecasting the carbon footprint of a car from its city-based fuel economy. Our previous analysis (Section 4/7) showed that this relationship was nonlinear. Close inspection of Figure 4.3 suggests that a change in slope occurs at about 25 mpg. This can be modelled using two predictors: $x$ (the City mpg) and

$$z = (x-25)_+ = \begin{cases} 0 & \text{if } x < 25, \\ x-25 & \text{if } x \ge 25. \end{cases}$$

The resulting fitted values are shown as the red line below.

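```r
library(fpp)  # assumed to provide the fuel data and load the forecast package

# Truncated term for the knot at 25 mpg: zero below 25, City - 25 above
Cityp <- pmax(fuel$City - 25, 0)
fit2 <- lm(Carbon ~ City + Cityp, data = fuel)

# Predict over a grid of City values, building the same truncated term
x <- 15:50
z <- pmax(x - 25, 0)
fcast2 <- forecast(fit2, newdata = data.frame(City = x, Cityp = z))

plot(jitter(Carbon) ~ jitter(City), data = fuel)
lines(x, fcast2$mean, col = "red")
```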
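To make the role of the extra predictor explicit, the fitted function can be written out on each side of the knot. The symbols $\beta_0$, $\beta_1$ and $\beta_2$ are assumed here to denote the coefficients on the intercept, `City` and `Cityp` in `fit2`:

$$f(x) = \beta_0 + \beta_1 x + \beta_2 \max(x-25, 0) = \begin{cases} \beta_0 + \beta_1 x & \text{if } x < 25, \\ (\beta_0 - 25\beta_2) + (\beta_1 + \beta_2)\, x & \text{if } x \ge 25. \end{cases}$$

So $\beta_1$ is the slope below 25 mpg, $\beta_2$ is the change in slope at the knot, and both pieces take the value $\beta_0 + 25\beta_1$ at $x = 25$, which is why the fitted line bends without jumping.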

## Regression splines

Piecewise linear relationships constructed in this way are a special case of *regression splines*. In general, a linear regression spline is obtained using

$$x_{1} = x, \quad x_{2} = (x-c_{1})_+, \quad \dots, \quad x_{k} = (x-c_{k-1})_+,$$

where $c_{1},\dots,c_{k-1}$ are the knots (the points at which the line can bend) and $(u)_+ = \max(u, 0)$, so each term is zero to the left of its knot. Selecting the number of knots ($k-1$) and where they should be positioned can be difficult and somewhat arbitrary. Automatic knot selection algorithms are available in some software, but are not yet widely used.
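The construction above extends directly to several knots. The following is a minimal sketch rather than a recommended implementation: `linear_spline_basis` is a hypothetical helper written for this illustration, the knot positions (20 and 35) are arbitrary, and the data are synthetic, not the fuel data.

```r
# Build the predictors x, (x - c_1)_+, ..., (x - c_{k-1})_+ for a linear regression spline.
# `linear_spline_basis` is a hypothetical helper, not part of any package.
linear_spline_basis <- function(x, knots) {
  hinge <- sapply(knots, function(knot) pmax(x - knot, 0))  # one truncated term per knot
  basis <- cbind(x, hinge)
  colnames(basis) <- paste0("x", seq_len(ncol(basis)))
  basis
}

# Synthetic data (illustrative only): the slope changes at x = 20 and x = 35.
set.seed(1)
x <- seq(10, 50, length.out = 200)
y <- 10 - 0.3 * x + 0.2 * pmax(x - 20, 0) + 0.1 * pmax(x - 35, 0) +
  rnorm(200, sd = 0.05)

X <- linear_spline_basis(x, knots = c(20, 35))
fit <- lm(y ~ X)  # ordinary least squares on the spline predictors
coef(fit)         # intercept, initial slope, then one slope change per knot
```

Because the model is still linear in its coefficients, `lm` fits it like any other multiple regression. Each coefficient on a truncated term estimates how much the slope changes at that knot.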

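A smoother result is obtained using piecewise cubics rather than piecewise lines. These are constrained so they are continuous (they join up) and they are smooth (so there are no sudden changes of direction as we see with piecewise linear splines). In general, a cubic regression spline with knots $c_{1},\dots,c_{k-3}$ is written as

$$x_{1} = x, \quad x_{2} = x^2, \quad x_{3} = x^3, \quad x_{4} = (x-c_{1})_+^3, \quad \dots, \quad x_{k} = (x-c_{k-3})_+^3,$$

where $(u)_+^3$ denotes $\max(u, 0)^3$.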

An example of a cubic regression spline fitted to the fuel economy data is shown below with a single knot at $c_{1}=25$.

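```r
# Cubic spline with one knot at 25: cubic polynomial in City plus a cubed truncated term
Cityp <- pmax(fuel$City - 25, 0)
fit3 <- lm(Carbon ~ City + I(City^2) + I(City^3) + I(Cityp^3), data = fuel)
fcast3 <- forecast(fit3, newdata = data.frame(City = x, Cityp = z))  # x, z as in the piecewise linear fit
plot(jitter(Carbon) ~ jitter(City), data = fuel)
lines(x, fcast3$mean, col = "red")
```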

This usually gives a better fit to the data, although forecasting values of Carbon when City is outside the range of the historical data becomes very unreliable.
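The caution about extrapolation can be explored with a small experiment. The sketch below is illustrative only: it uses synthetic data (not the fuel data), an arbitrary true curve $30/\sqrt{x}$, and a hypothetical helper `cubic_spline_basis` that builds the cubic spline predictors for a given set of knots. It fits the spline on values of $x$ between 15 and 40 and then evaluates the fitted curve both inside and beyond that range.

```r
# Predictors for a cubic regression spline:
# x, x^2, x^3, plus one cubed truncated term per knot.
# `cubic_spline_basis` is a hypothetical helper, not part of any package.
cubic_spline_basis <- function(x, knots) {
  trunc3 <- sapply(knots, function(knot) pmax(x - knot, 0)^3)
  cbind(x, x^2, x^3, trunc3)
}

# Synthetic data (illustrative only), observed for x between 15 and 40.
set.seed(2)
x_obs <- runif(150, 15, 40)
y_obs <- 30 / sqrt(x_obs) + rnorm(150, sd = 0.1)

B <- cubic_spline_basis(x_obs, knots = 25)
fit_cubic <- lm(y_obs ~ B)

# Evaluate the fitted curve inside (20, 30, 40) and beyond (50, 60) the observed range.
x_new <- c(20, 30, 40, 50, 60)
B_new <- cbind(1, cubic_spline_basis(x_new, knots = 25))  # leading 1 for the intercept
pred <- drop(B_new %*% coef(fit_cubic))
data.frame(x = x_new, fitted = pred, truth = 30 / sqrt(x_new))
```

Comparing the `fitted` and `truth` columns row by row gives a feel for how the cubic terms behave once $x$ leaves the range of the data. The size of any discrepancy depends on the data and the knot placement, so this is a diagnostic habit, not a general result.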