3.6 Sampling distributions and confidence intervals
While discussing the CLT, we introduced the idea of repeatedly sampling a population, each time calculating a mean. The resulting distribution of sample means is called the sampling distribution of the mean. Instead of calculating means, we could have calculated any number of other statistics and then examined their resulting distributions. Thus, a sampling distribution is defined as the distribution of any statistic, with the understanding that such a distribution results from repeated sampling. In practice, many sampling distributions can be derived mathematically without actually having to resort to repeated sampling^{1}. For example, the CLT tells us the sampling distribution of the mean. Furthermore, sampling distributions can be represented based on frequencies (as was done in Fig.3.7b), or, if derived from some theoretical expectation, they can be represented as PDFs. We refer to the standard deviation of any sampling distribution as the standard error of the statistic on which the sampling distribution is based.
Regardless of how it is derived, the sampling distribution represents an expectation about the possible values of the statistic that could have been obtained, assuming that multiple random samples were possible. As a result, it provides a mechanism for quantifying uncertainty in our estimates of a particular parameter (refer back to Fig.2.2).
For the sake of this discussion, let’s assume that we are interested in estimating the mean of a known population. We will further assume that the population follows a normal distribution with a known $\mu $ and $\sigma $. In fact, let’s assume that it is the same distribution from before, with $y \sim N(9,5)$. Of course, in practice we will never really know these parameters (otherwise, why do any sampling at all?), but this assumption will facilitate an explanation of the development of measures of confidence. From this known population, we have collected a random sample of 20 items, from which we calculate $\bar y$. The relationship between the distribution of the actual population (again, you would never know this distribution in a real situation) and the sampling distribution of the mean is illustrated in Fig.3.8. Because the standard deviation of the sampling distribution, i.e., $\sigma _{\bar y}$, is calculated by dividing $\sigma $ by $\sqrt {n}$, the sampling distribution will always be narrower than the distribution of the original population.
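The relationship between $\sigma $ and $\sigma _{\bar y}$ can be checked by simulation. The following sketch (the replicate count of 10,000 is an arbitrary choice for illustration, not from the text) draws repeated samples of $n = 20$ from $N(9,5)$ and compares the standard deviation of the resulting sample means to $\sigma /\sqrt {n}$:

```r
# Simulate the sampling distribution of the mean for samples of n = 20
# drawn from the known population y ~ N(9, 5).
set.seed(1)            # for reproducibility
n     <- 20
means <- replicate(10000, mean(rnorm(n, mean = 9, sd = 5)))

sd(means)              # empirical standard error of the mean
5 / sqrt(n)            # theoretical value, sigma / sqrt(n)
```

The two printed values should agree closely, and a histogram of `means` would be visibly narrower than the original $N(9,5)$ population.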
Because we know something about this sampling distribution, we can base a measure of confidence in our sample statistic $\bar y$ on the amount of variability we expect to see in a sampling distribution with a standard error that is consistent with what we observed in our sample. In this discussion, because we know $\sigma $ from the original distribution, this will be $\sigma $ divided by $\sqrt {n}$. More specifically, we can base our measure of confidence on the region of the sampling distribution within which we expect a certain percentage of all possible $\bar y_{i}$ to occur; 95% is commonly used. Thus, mathematically, we want

$$P\left (-1.96 \leq \frac {\bar y - \mu }{\sigma _{\bar y}} \leq 1.96\right ) = 0.95$$
The center of the formula should look familiar: it is basically the z-transformation, except that here we are transforming a sample mean. Thus, the appropriate standard deviation for the denominator is the standard error of the mean. Rearranging to isolate $\mu $, we have

$$P(\bar y - 1.96\sigma _{\bar y} \leq \mu \leq \bar y + 1.96\sigma _{\bar y}) = 0.95$$
This statement tells us that, given the observed standard error, $\sigma_{\bar y}$, the probability that $\bar y - 1.96\sigma_{\bar y} \leq \mu $ and $\bar y + 1.96\sigma_{\bar y} \geq \mu $ is 0.95. We will call this the 95% confidence interval^{2}. It can also be written as

$$\bar y \pm z_{.975}\,\sigma _{\bar y}$$
Notice that because the standard normal distribution is symmetric, we can simply state this confidence interval as the sample mean plus or minus the quantile of the standard normal at 0.975 (i.e., $z_{.975}$) times the standard error of the mean. (In the standard normal, which is symmetric around zero, $z_{.025} = -z_{.975}$. You should be able to check this out using qnorm().) Another way of interpreting this confidence interval (sometimes abbreviated CI) is, if you were to repeatedly sample the same population, each time calculating a 95% CI, then 95% of those intervals will contain the true population mean $\mu $. This interpretation is illustrated in Fig.3.9.
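Both claims in the paragraph above can be verified by simulation. The sketch below (the replicate count is an arbitrary choice, not from the text) first checks the symmetry of the standard normal quantiles with qnorm(), then repeatedly samples $N(9,5)$ with $n = 20$, computes a 95% CI each time, and tallies how often the interval contains $\mu = 9$:

```r
# Symmetry of the standard normal quantiles:
qnorm(0.025)           # -1.959964
qnorm(0.975)           #  1.959964 (equal magnitude, opposite sign)

# Repeated-sampling interpretation of a 95% CI:
set.seed(1)
n     <- 20
mu    <- 9
sigma <- 5
se    <- sigma / sqrt(n)
covered <- replicate(10000, {
  ybar <- mean(rnorm(n, mu, sigma))
  (ybar - qnorm(0.975) * se <= mu) && (mu <= ybar + qnorm(0.975) * se)
})
mean(covered)          # proportion of intervals containing mu; close to 0.95
```

The final proportion should land very near 0.95, mirroring the interpretation illustrated in Fig.3.9.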
Going back to our example, we randomly obtained a sample of 20 items from a population with $y \sim N(9,5)$. Let’s say that $\bar y $ for our sample was 8.48. The 95% CI for this value would be calculated as follows.
> 8.48 - qnorm(0.975) * 5/sqrt(20)
[1] 6.288694
> 8.48 + qnorm(0.975) * 5/sqrt(20)
[1] 10.67131
or, more directly,
> qnorm(0.025, 8.48, 5/sqrt(20))
[1] 6.288694
> qnorm(0.975, 8.48, 5/sqrt(20))
[1] 10.67131
Another way of reporting these values is $\bar y = 8.48 \pm 2.19$, where 2.19 came from
> qnorm(0.975) * 5/sqrt(20)
[1] 2.191306
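The calculation above can be wrapped in a small convenience function. The function name norm_ci and its signature are hypothetical (not from the text); it simply packages the normal-theory CI for a mean when $\sigma $ is known:

```r
# Hypothetical helper: normal-theory CI for a mean with known sigma.
norm_ci <- function(ybar, sigma, n, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)   # e.g., z = 1.959964 for level = 0.95
  c(lower = ybar - z * sigma / sqrt(n),
    upper = ybar + z * sigma / sqrt(n))
}

norm_ci(8.48, 5, 20)   # reproduces the interval (6.288694, 10.67131)
```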

In fact, although the sampling distribution is conceptually built from repeated samples, such repeated sampling is rarely carried out. In reality, you most often have only one set of observations. ↩

Note that it is not correct to say that the probability that any one interval, calculated using this formula, contains $\mu $ is 0.95. Rather, 95% of all intervals so calculated will contain $\mu $. Because $\mu $ is fixed, the probability that any specific interval contains it is either 0 or 1. ↩