4.10 Exercises

  1. Electricity consumption was recorded for a small town on 12 randomly chosen days. The following maximum temperatures (degrees Celsius) and consumption (megawatt-hours) were recorded for each day. 
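    | Day  | 1    | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10   | 11   | 12   |
    |------|------|------|------|------|------|------|------|------|------|------|------|------|
    | Mwh  | 16.3 | 16.8 | 15.5 | 18.2 | 15.2 | 17.5 | 19.8 | 19.0 | 17.5 | 16.0 | 19.6 | 18.0 |
    | temp | 29.3 | 21.7 | 23.7 | 10.4 | 29.7 | 11.9 | 9.0  | 23.4 | 17.8 | 30.0 | 8.6  | 11.8 |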
    1. Plot the data and find the regression model for Mwh with temperature as an explanatory variable. Why is there a negative relationship?
    2. Produce a residual plot. Is the model adequate? Are there any outliers or influential observations?
    3. Use the model to predict the electricity consumption that you would expect for a day with maximum temperature $10^\circ $ and a day with maximum temperature $35^\circ $. Do you believe these predictions?
    4. Give prediction intervals for your forecasts. The following R code will get you started:
      R code
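      library(fpp)  # assumed source of the econsumption data; also loads the forecast package
      plot(Mwh ~ temp, data=econsumption)                # scatterplot of consumption against temperature
      fit <- lm(Mwh ~ temp, data=econsumption)           # simple linear regression
      plot(residuals(fit) ~ temp, data=econsumption)     # residuals against the predictor
      forecast(fit, newdata=data.frame(temp=c(10,35)))   # forecasts with prediction intervals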
  2. The following table gives the winning times (in seconds) for the men’s 400 meters final in each Olympic Games from 1896 to 1996 (data set `olympic`).
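    | Year | Time | Year | Time | Year | Time  | Year | Time  |
    |------|------|------|------|------|-------|------|-------|
    | 1896 | 54.2 | 1928 | 47.8 | 1964 | 45.1  | 1992 | 43.50 |
    | 1900 | 49.4 | 1932 | 46.2 | 1968 | 43.8  | 1996 | 43.49 |
    | 1904 | 49.2 | 1936 | 46.5 | 1972 | 44.66 |      |       |
    | 1908 | 50.0 | 1948 | 46.2 | 1976 | 44.27 |      |       |
    | 1912 | 48.2 | 1952 | 45.9 | 1980 | 44.60 |      |       |
    | 1920 | 49.6 | 1956 | 46.7 | 1984 | 44.27 |      |       |
    | 1924 | 47.6 | 1960 | 44.9 | 1988 | 43.87 |      |       |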
    1. Update the data set `olympic` to include the winning times from the last few Olympics.
    2. Plot the winning time against the year. Describe the main features of the scatterplot.
    3. Fit a regression line to the data. Obviously the winning times have been decreasing, but at what *average* rate per year?
    4. Plot the residuals against the year. What does this indicate about the suitability of the fitted line?
    5. Predict the winning time for the men’s 400 meters final in the 2000, 2004, 2008 and 2012 Olympics. Give a prediction interval for each of your forecasts. What assumptions have you made in these calculations?
    6. Find out the actual winning times for these Olympics (see www.databaseolympics.com). How good were your forecasts and prediction intervals?
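    As with the first exercise, some R code may help you get started. This is only a sketch under stated assumptions: the data frame `olympic_tab` and its column names `Year` and `time` are hypothetical and are typed in from the table rather than taken from the `olympic` data set, and base R's `predict()` is used in place of `forecast()` so that no package is required.

    ```r
    # Winning times transcribed from the table (1896 to 1996).
    # olympic_tab, Year and time are illustrative names, not the packaged data set.
    olympic_tab <- data.frame(
      Year = c(1896, 1900, 1904, 1908, 1912, 1920, 1924, 1928, 1932, 1936,
               1948, 1952, 1956, 1960, 1964, 1968, 1972, 1976, 1980, 1984,
               1988, 1992, 1996),
      time = c(54.2, 49.4, 49.2, 50.0, 48.2, 49.6, 47.6, 47.8, 46.2, 46.5,
               46.2, 45.9, 46.7, 44.9, 45.1, 43.8, 44.66, 44.27, 44.60, 44.27,
               43.87, 43.50, 43.49)
    )

    # Scatterplot with the fitted regression line
    plot(time ~ Year, data = olympic_tab)
    fit <- lm(time ~ Year, data = olympic_tab)
    abline(fit)

    # Slope: average change in winning time (seconds) per year
    coef(fit)["Year"]

    # Residuals against year, to judge whether a straight line is suitable
    plot(residuals(fit) ~ Year, data = olympic_tab)

    # Point forecasts with 95% prediction intervals for later Games
    new_years <- data.frame(Year = c(2000, 2004, 2008, 2012))
    pred <- predict(fit, newdata = new_years, interval = "prediction", level = 0.95)
    cbind(new_years, pred)
    ```

    The columns `lwr` and `upr` returned by `predict()` are the lower and upper limits of each prediction interval. They rest on the usual regression assumptions, including that the linear trend continues beyond 1996.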
  3. An elasticity coefficient is the ratio of the percentage change in the forecast variable ($y$) to the percentage change in the predictor variable ($x$). Mathematically, the elasticity is defined as $(dy/dx)\times (x/y)$. Consider the log-log model, $$ \log y=\beta _0+\beta _1 \log x + \varepsilon . $$ Express $y$ as a function of $x$ and show that the coefficient $\beta_1$ is the elasticity coefficient.
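
Before attempting the algebra in the last exercise, it may help to check the claim numerically. The sketch below is not a proof. It simulates data from a log-log model with made-up values of $\beta_0$ and $\beta_1$ (the names `beta0`, `beta1`, `x0` and `y_at` are assumptions chosen for illustration), fits the regression, and then compares the percentage change in $y$ produced by a 1% change in $x$ with $\beta_1$.

```r
# Simulate from log(y) = beta0 + beta1 * log(x) + error.
# beta0 and beta1 are arbitrary illustrative values, not estimates from any data.
set.seed(1)
beta0 <- 1.0
beta1 <- 0.7
x <- runif(200, min = 1, max = 50)
y <- exp(beta0 + beta1 * log(x) + rnorm(200, sd = 0.05))

# Fit the log-log regression; the slope estimate should be close to beta1
fit <- lm(log(y) ~ log(x))
slope <- unname(coef(fit)[2])
slope

# Noise-free value of y implied by the model at a given x
y_at <- function(x) exp(beta0 + beta1 * log(x))

# Percentage change in y when x increases by 1% from x0
x0 <- 10
pct_change_y <- (y_at(1.01 * x0) - y_at(x0)) / y_at(x0) * 100
pct_change_y  # close to beta1, i.e. roughly beta1 percent per 1% change in x
```

Repeating the last step with other values of `x0` gives the same percentage change, which is consistent with the elasticity being constant in a log-log model.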