5.4 The interpretation

Our observed $\chi ^{2}$ = 92.12 is much larger than the critical value of 11.07. In addition, our p-value, which rounds to 0, is far below our predetermined $\alpha = 0.05$. Thus, we reject $H_{o}$ in favor of $H_{a}$ and conclude that the distribution of plants is not random. To facilitate the interpretation of this result, it is useful to ask, "In what way does the observed frequency distribution differ from what the Poisson PMF predicts?" Looking at Fig. 5.2, we can see that there were more observations at the extremes of the distribution (zeros and large values) than we would have expected.
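
The decision rule can be checked directly in R. The following is a minimal sketch, not necessarily how the values above were produced: it assumes 5 degrees of freedom (inferred from the stated critical value of 11.07 at $\alpha = 0.05$), and the object names `alpha`, `df`, `chi_sq`, `crit`, and `p_val` are ours.

```r
# Assumption: df = 5, inferred from the stated critical value of 11.07.
alpha  = 0.05
df     = 5
chi_sq = 92.12   # observed test statistic reported in the text

# Upper-tail critical value and p-value of the chi-square distribution.
crit  = qchisq(1 - alpha, df = df)
p_val = pchisq(chi_sq, df = df, lower.tail = FALSE)

round(crit, 2)    # 11.07
chi_sq > crit     # TRUE: the statistic falls in the rejection region
p_val < alpha     # TRUE: the p-value is below alpha
```

Both comparisons lead to the same decision, as they must: the statistic exceeds the critical value exactly when the p-value falls below $\alpha$.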

The coefficient of dispersion ($CD$) is a useful way to quantify this difference. By definition, it is

\begin{equation} CD = \frac{\sigma ^{2}}{\mu }. \end{equation}

Thus, it is just the ratio of the variance to the mean. You may also see this referred to as the variance-to-mean ratio (VMR). When dealing with a sample, we can calculate the observed $CD$ using $s^{2} / \bar y$. For these data, we first need to calculate $s^{2}$.

     >  v = var(rep(0:8, f_obs))
     >  v

    [1] 2.250758

To calculate $s^{2}$, we again used rep() to recreate the original data vector: each count from 0 to 8 is repeated as many times as it was observed. Now we can calculate the $CD$.

     >  cd = v/ybar
     >  cd

    [1] 1.594021
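
To make the calculation easier to follow, here is a small self-contained sketch of the same steps. The frequencies are hypothetical and chosen only for illustration (they are not the plant data), and `coef_disp()` is a helper name of our own, not a built-in R function. The sketch also confirms that rebuilding the raw data with rep() gives the same answer as working directly from the frequency table.

```r
# Hypothetical counts and frequencies, for illustration only
# (these are NOT the plant data analyzed in the text).
counts = 0:4
freqs  = c(10, 5, 3, 1, 1)

# Rebuild the raw data vector: each count repeated by its frequency.
y = rep(counts, freqs)

# Hypothetical helper: sample variance divided by sample mean.
coef_disp = function(y) var(y) / mean(y)
cd_direct = coef_disp(y)

# The same quantity computed straight from the frequency table.
n        = sum(freqs)
ybar_f   = sum(counts * freqs) / n
s2_f     = sum(freqs * (counts - ybar_f)^2) / (n - 1)
cd_table = s2_f / ybar_f

all.equal(cd_direct, cd_table)   # TRUE: the two routes agree
```

Using rep() is usually the simpler route, because var() and mean() then handle the $n - 1$ denominator and the weighting for us.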

To interpret this value, we have to know something about the expected distribution. Going back to Section 3.3, we saw that for a Poisson (i.e., random) frequency distribution of counts, $\mu = \sigma ^{2}$. Thus, for a Poisson distribution, $CD = 1$, and we can use 1 as a benchmark for interpretation. If $CD > 1$, the variance exceeds the mean and we say that the data are more clumped than we would have expected; equivalently, the data are overdispersed. If $CD < 1$, the variance is smaller than the mean and we say that the distribution of observations is more regular, or even, than we would have expected; equivalently, the data are underdispersed.
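
A short simulation can illustrate the benchmark. This is only a sketch under assumptions of our own: the mean of 1.4, the sample size, the seed, and the choice of a negative binomial and a binomial distribution as stand-ins for clumped and regular patterns are all illustrative and do not come from the plant data. In theory, these three distributions have $CD$ values of 1, $1 + \mu / \text{size} = 1.7$, and $1 - p = 0.72$, respectively.

```r
set.seed(1)   # arbitrary seed, for reproducibility only
n  = 1e5      # simulated sample size (assumption)
mu = 1.4      # illustrative mean (assumption)

# Hypothetical helper: sample variance divided by sample mean.
coef_disp = function(y) var(y) / mean(y)

y_pois  = rpois(n, lambda = mu)                # random: variance = mean
y_clump = rnbinom(n, size = 2, mu = mu)        # negative binomial: variance > mean
y_even  = rbinom(n, size = 5, prob = mu / 5)   # binomial: variance < mean

cd_sim = c(poisson = coef_disp(y_pois),
           clumped = coef_disp(y_clump),
           regular = coef_disp(y_even))
round(cd_sim, 2)
```

With a sample this large, the simulated values should sit close to their theoretical counterparts, with the Poisson sample near 1, the clumped sample well above it, and the regular sample well below it.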

So, for our data, with a $CD$ of 1.59, the data are more clumped than we would expect due to chance alone. Note that, had we failed to reject $H_{o}$, it would not have made sense to interpret the $CD$: any deviation from the theoretical value of 1 could be attributed to random sampling error.