4.7 Non-linear functional forms

Although the linear relationship assumed at the beginning of this chapter is often adequate, there are cases for which a non-linear functional form is more suitable. The scatter plot of Carbon versus City in Figure 4.3 is an example where a non-linear functional form is required.

The simplest way of obtaining a non-linear specification is to transform the variables $y$ and/or $x$ and then estimate a regression model using the transformed variables. The most commonly used transformation is the (natural) logarithm (see Section 2/4). Recall that a logarithmic transformation can be applied to a variable only if all of its observed values are greater than zero.

A log-log functional form is specified as $$\log y_i=\beta_0+\beta_1 \log x_i + \varepsilon_i. $$ In this model, the slope $\beta_1$ can be interpreted as an elasticity: $\beta_1$ is the average percentage change in $y$ resulting from a $1\%$ change in $x$.
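To see why $\beta_1$ acts as an elasticity, ignore the error term and differentiate both sides of the log-log equation. The value $\beta_1=-0.8$ used below is purely illustrative and is not an estimate from the Car data. $$\frac{dy}{y}=\beta_1\frac{dx}{x},$$ so a small relative change in $x$ is scaled by $\beta_1$ to give the relative change in $y$. For a finite change the exact relationship is $$\frac{y_{\text{new}}}{y_{\text{old}}}=\left(\frac{x_{\text{new}}}{x_{\text{old}}}\right)^{\beta_1}.$$ With $\beta_1=-0.8$ and a $1\%$ increase in $x$, this gives $1.01^{-0.8}\approx 0.9921$, a fall in $y$ of about $0.79\%$, which is close to the $0.8\%$ suggested by the elasticity reading. The approximation becomes less accurate as the percentage change in $x$ grows.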

Figure 4.7 shows a scatter plot of Carbon versus City and the fitted log-log model in both the original scale and the logarithmic scale. In the original scale, the slope of the fitted curve is not constant: it depends on $x$ and can be calculated at each point (see Table 4.1). In the logarithmic scale, the slope of the line, which is now interpreted as an elasticity, is constant. So estimating a log-log functional form produces a constant elasticity estimate, in contrast to the linear model, which produces a constant slope estimate.
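The contrast between a constant elasticity and a varying slope can be illustrated numerically. The following is a minimal sketch on simulated data, not the Car data: the variables `x` and `y`, the sample size, and the assumed true elasticity of -0.8 are all hypothetical choices made for illustration. It fits a log-log model and evaluates the slope $\beta_1 \hat{y}/x$ at three values of $x$.

```r
# Simulated data (hypothetical, not the Car data): y = 40 * x^(-0.8) with
# multiplicative noise, so the true elasticity is -0.8 by construction.
set.seed(1)
x <- runif(200, min = 10, max = 50)
y <- 40 * x^(-0.8) * exp(rnorm(200, sd = 0.05))

# Fit the log-log model; the coefficient on log(x) is the elasticity estimate.
fit <- lm(log(y) ~ log(x))
b1 <- unname(coef(fit)[2])

# Slope in the original scale at chosen x values: b1 * yhat / x.
# exp(predict(...)) is a simple back-transform of the fitted log values.
x0 <- c(15, 25, 40)
yhat <- exp(predict(fit, newdata = data.frame(x = x0)))
slope <- b1 * yhat / x0

# The elasticity column is the same in every row; the slope column is not.
print(data.frame(x = x0, fitted = yhat, slope = slope, elasticity = b1))
```

In this simulation the slope shrinks in magnitude as `x` increases, while the elasticity estimate stays fixed at a single value.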

Figure 4.7: Fitting a log-log functional form to the Car data example. Plots show the estimated relationship both in the original and the logarithmic scales.

R code
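# Fit the log-log model (fuel is assumed to be loaded already, e.g. from the fpp package)
fit2 <- lm(log(Carbon) ~ log(City), data=fuel)
par(mfrow=c(1,2))  # two panels side by side
# Original scale: back-transform the fitted line, giving a curve
plot(jitter(Carbon) ~ jitter(City), xlab="City (mpg)",
  ylab="Carbon footprint (tonnes per year)", data=fuel)
lines(1:50, exp(coef(fit2)[1]+coef(fit2)[2]*log(1:50)))
# Logarithmic scale: the fitted model is a straight line
plot(log(jitter(Carbon)) ~ log(jitter(City)),
  xlab="log City mpg", ylab="log carbon footprint", data=fuel)
abline(fit2)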

Figure 4.8: Residual plot from estimating a log-log functional form for the Car data example. The residuals now look much more randomly scattered compared to Figure 4.4.

R code
res <- residuals(fit2)
plot(jitter(res, amount=.005) ~ jitter(log(City)),
  ylab="Residuals", xlab="log(City)", data=fuel)

Figure 4.8 shows a plot of the residuals from estimating the log-log model. They are now much more randomly scattered than the residuals from the linear model plotted in Figure 4.4. We can conclude that the log-log functional form clearly fits the data better.

Other useful forms are the log-linear form and the linear-log form. Table 4.1 summarises these.

| Model | Functional form | Slope | Elasticity |
|---|---|---|---|
| linear | $y=\beta_0+\beta_1 x$ | $\beta_1$ | $\beta_1 x/y$ |
| log-log | $\log y=\beta_0+\beta_1 \log x$ | $\beta_1 y/x$ | $\beta_1$ |
| linear-log | $y=\beta_0+\beta_1 \log x$ | $\beta_1/x$ | $\beta_1/y$ |
| log-linear | $\log y=\beta_0+\beta_1 x$ | $\beta_1 y$ | $\beta_1 x$ |

Table 4.1: Summary of selected functional forms. Slopes and elasticities that depend on the values of $y$ and $x$ are commonly evaluated at the sample means of $y$ and $x$.
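The entries in Table 4.1 can be checked by differentiating each functional form with the error term ignored, taking the slope to be $dy/dx$ and the elasticity to be $(dy/dx)(x/y)$. A brief sketch of the working: $$\begin{aligned}
\text{linear:}\quad & \frac{dy}{dx}=\beta_1, && \frac{dy}{dx}\cdot\frac{x}{y}=\frac{\beta_1 x}{y};\\
\text{log-log:}\quad & \frac{1}{y}\frac{dy}{dx}=\frac{\beta_1}{x}\ \Rightarrow\ \frac{dy}{dx}=\frac{\beta_1 y}{x}, && \frac{dy}{dx}\cdot\frac{x}{y}=\beta_1;\\
\text{linear-log:}\quad & \frac{dy}{dx}=\frac{\beta_1}{x}, && \frac{dy}{dx}\cdot\frac{x}{y}=\frac{\beta_1}{y};\\
\text{log-linear:}\quad & \frac{1}{y}\frac{dy}{dx}=\beta_1\ \Rightarrow\ \frac{dy}{dx}=\beta_1 y, && \frac{dy}{dx}\cdot\frac{x}{y}=\beta_1 x.
\end{aligned}$$ Only the linear form has a constant slope, and only the log-log form has a constant elasticity.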

Figure 4.9: The four functional forms shown in Table 4.1.