4.2.2 The Neyman-Pearson approach
In his correspondence with Neyman, Pearson began to question Fisher's treatment of the p-value and the notion of significance. He was particularly disturbed by the temptation to use p-values to attempt to prove that a hypothesis was true (although Fisher himself never fell into this trap). Neyman and Pearson collaborated on extending Fisher's ideas, and their work laid the foundation for the traditional statistical approaches that are prevalent in today's research literature.
The Neyman-Pearson framework extends Fisher's approach in two specific ways. First, Neyman and Pearson concluded that testing any hypothesis made little sense in the absence of an explicit alternative that could be assumed correct if the original hypothesis was rejected. Thus, they called the hypothesis actually being tested the null hypothesis, denoted $H_{o}$; the alternative hypothesis was denoted $H_{a}$. A small p-value, and hence a significant hypothesis test, meant that $H_{o}$ was rejected in favor of $H_{a}$.
Second, instead of relying on a post hoc judgment of whether an observed p-value was small enough to be significant, Neyman and Pearson argued that a critical probability level must be set before conducting the test. This cutoff value is called $\alpha $. If $p \leq \alpha $, then the result of the hypothesis test is significant, and $H_{o}$ is rejected in favor of $H_{a}$. If $p > \alpha $, then you fail to reject $H_{o}$. Notice that in this framework the p-value is no longer viewed as a measure of the strength of evidence against $H_{o}$, as it was in Fisher's original framework: given $\alpha $, $H_{o}$ is either rejected or it is not.
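This fixed-$\alpha $ decision rule is simple enough to sketch in code; the function name below is ours, not from the text:

```python
def np_decision(p, alpha=0.05):
    """Neyman-Pearson decision rule: alpha is fixed before the test is run."""
    # p <= alpha: the test is significant, so H_o is rejected in favor of H_a;
    # otherwise we fail to reject H_o (we do not "accept" it).
    return "reject H_o" if p <= alpha else "fail to reject H_o"
```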
Together, $H_{o}$ and $H_{a}$ must cover all possible values of the parameter of interest, and the relationship between the two can be illustrated using our example. We are interested in the hypothesis that the mean of the population is 5. From this, three separate pairs of hypotheses can be developed^{1}:

1. $H_{o}: \mu \geq 5$ versus $H_{a}: \mu < 5$
2. $H_{o}: \mu = 5$ versus $H_{a}: \mu \neq 5$
3. $H_{o}: \mu \leq 5$ versus $H_{a}: \mu > 5$
As in Fisher's approach, to test any of these hypotheses we focus on the sampling distribution of $\bar y$, assuming that $\mu = 5$. The critical region of this sampling distribution contains the values of the observed test statistic for which $H_{o}$ would be rejected. Its size is determined by $\alpha $ and its location by $H_{a}$ (Fig. 4.2).
Figure 4.2: In the Neyman-Pearson approach to hypothesis testing, $H_{o}$ is compared to a specific alternative hypothesis. Here, the sampling distribution of $\bar y$ is shown with the possible alternative hypotheses illustrated. In all cases, the shaded region indicates $\alpha $, which was set at 0.05; its location is determined by $H_{a}$. In this example, $H_{o}$ is rejected (i.e., the result is significant) only when $H_{a}$ is $\mu > 5$. Thus, the specification of $H_{a}$ is an important step when conducting a null hypothesis test.
In Figs. 4.2a and 4.2b, the observed test statistic $\bar y = 6.6$ does not fall in the critical region. As a result, both of those hypothesis tests would be nonsignificant. However, in Fig. 4.2c the observed test statistic falls in the critical region, so $H_{o}$ would be rejected in favor of $H_{a}$, and we would conclude that $\mu > 5$. Thus, the location of $H_{a}$ is critically important and can change the overall interpretation of the results!
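The boundaries of the three critical regions can be sketched numerically, assuming a normal sampling distribution; the standard error below is hypothetical, since its value is not given in this passage:

```python
from statistics import NormalDist

mu0, ybar, alpha = 5.0, 6.6, 0.05
se = 0.95  # hypothetical standard error, assumed for illustration
sampling = NormalDist(mu=mu0, sigma=se)  # sampling distribution of ybar under H_o

# H_a: mu < 5 -- the critical region is the lower tail
lower_cut = sampling.inv_cdf(alpha)              # reject if ybar <= lower_cut
# H_a: mu != 5 -- the critical region is split between both tails
two_lo = sampling.inv_cdf(alpha / 2)
two_hi = sampling.inv_cdf(1 - alpha / 2)         # reject if ybar outside [two_lo, two_hi]
# H_a: mu > 5 -- the critical region is the upper tail
upper_cut = sampling.inv_cdf(1 - alpha)          # reject if ybar >= upper_cut

reject_left = ybar <= lower_cut
reject_two = ybar <= two_lo or ybar >= two_hi
reject_right = ybar >= upper_cut
```

With this (assumed) standard error, only the upper-tailed test rejects $H_{o}$, mirroring the pattern in Fig. 4.2.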
The graphical approach of comparing the location of the observed test statistic to the critical region is equivalent to comparing the p-value to $\alpha $. This is easily done for each of the possible sets of hypotheses. For the first set, the direction of $H_{a}$ means the p-value falls in the left tail of the sampling distribution.
> Ha1.p
[1] 0.9541549
For the second set, $H_{a}$ considers both tails of the distribution. Because the sampling distribution is symmetric, this p-value is always double the area found in one tail.
> Ha2.p
[1] 0.09169028
Finally, the last $H_{a}$ considers the right tail.
> Ha3.p
[1] 0.04584514
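The three p-values above were produced in R; the same tail logic can be sketched in Python. The standard error is again hypothetical, so the numbers only approximate those shown, but the relationships among the three p-values hold regardless:

```python
from statistics import NormalDist

mu0, ybar = 5.0, 6.6
se = 0.95  # hypothetical standard error, assumed for illustration
z = (ybar - mu0) / se  # standardized test statistic

p_left = NormalDist().cdf(z)      # H_a: mu < 5  (area in the left tail)
p_right = 1 - p_left              # H_a: mu > 5  (area in the right tail)
p_two = 2 * min(p_left, p_right)  # H_a: mu != 5 (double the smaller tail)
```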
Again, when comparing $p$ to $\alpha $, the result depends on $H_{a}$: only for the last set of hypotheses would $H_{o}$ be rejected in favor of $H_{a}$. This is why the Neyman-Pearson approach emphasizes the importance of specifying $H_{a}$.
When cast as a series of steps, the Neyman-Pearson approach can be described as:

1. generate a biological hypothesis;
2. translate the biological hypothesis into a statistical hypothesis, frequently stated in terms of a parameter;
3. gather data;
4. calculate the observed test statistic;
5. calculate $p$;
6. make a statistical conclusion by comparing $p$ to $\alpha $;
7. draw a biological conclusion about the original biological hypothesis of interest.
These steps form the foundation of what is sometimes referred to as classical null hypothesis testing.
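The steps above can be sketched end to end. The data below are invented, and a simple z-approximation stands in for whatever exact test a real design would call for:

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

alpha = 0.05  # set alpha before looking at the data
mu0 = 5.0     # statistical hypothesis: H_o: mu = 5 versus H_a: mu != 5
data = [6.1, 5.8, 7.0, 6.4, 6.9, 5.5, 7.2, 6.6]  # hypothetical sample

ybar = mean(data)                        # observed test statistic
se = stdev(data) / sqrt(len(data))       # estimated standard error of ybar
z = (ybar - mu0) / se
p = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed p-value (z-approximation)

decision = "reject H_o" if p <= alpha else "fail to reject H_o"
```

The biological conclusion (the final step) is then drawn from the statistical one in light of the original question.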

In some cases, you may see all three sets of hypotheses listed with the same $H_{o}: \mu = 5$. However, to be technically correct, and to ensure that the combination of $H_{o}$ and $H_{a}$ delineates all possibilities, $H_{o}$ should be listed as shown.