4.2.4 Bayesian approach

By abandoning the traditional notion of the p-value and using the concept of the likelihood, the LR approach to measuring the strength of evidence in support of one hypothesis, relative to another, takes one step away from the traditional hypothetico-deductive framework and the frequentist notion of probability. Bayesian approaches, on the other hand, take a giant leap away. In fact, Bayesian approaches represent a complete abandonment of traditional approaches [8].

In the Bayesian approach, parameters are explicitly treated as random variables, and the data are treated as fixed. The basic steps, modified from [11], are:

  1. develop a hypothesis or hypotheses

  2. represent the hypothesis or hypotheses as models

  3. specify the parameters of those models as random variables

  4. specify a prior probability distribution

  5. calculate the likelihood (this is Fisher’s likelihood, described in the previous section)

  6. calculate the posterior probability distribution

  7. interpret the results

Intuitively, the Bayesian approach is rather simple. It is founded on the following relation, derived from Bayes’ Theorem. We will use $\theta $ to represent a hypothesized parameter value.

\begin{equation} P(\theta \mid data) \propto P(\theta )P(data \mid \theta ) \end{equation}

The posterior probability distribution (usually called the “posterior” and denoted $P(\theta \mid data)$) is the outcome of interest. It is the probability distribution of our parameter (our hypothesis), given the observed data, and is proportional to the prior probability distribution (usually called the “prior” and denoted $P(\theta )$) multiplied by the likelihood, $P(data \mid \theta )$. The prior represents the biologist’s belief about the possible values of the parameter of interest, and is determined before any analysis. As described by [9], the prior can be interpreted in three different ways: as a frequency distribution based on a synthesis of previous data, as an objective statement about the distribution based on ignorance (i.e., an uninformative prior), or as a subjective statement of belief based on previous experience. The likelihood is derived from Fisher’s likelihood methods as described above.
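The proportionality above can be made concrete with a grid approximation: evaluate the prior and the likelihood over a fine grid of candidate parameter values, multiply them pointwise, and rescale so the result sums to one. The following is a minimal Python sketch, not a definitive implementation. The function names (`normal_pdf`, `grid_posterior`), the grid limits, and the step size are assumptions made for illustration, and the numbers (a normal prior with mean 5 and standard deviation 3, and a sample of 10 observations with mean 6.6 from a population with known $\sigma = 3$) are chosen to match the worked example later in this section.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def grid_posterior(thetas, prior, likelihood):
    """Multiply prior by likelihood at each grid point, then normalize to sum to 1."""
    unnormalized = [prior(t) * likelihood(t) for t in thetas]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Assumed illustrative values (see the lead-in paragraph).
n, ybar, sigma = 10, 6.6, 3.0
thetas = [i * 0.01 for i in range(-1000, 2001)]  # candidate values from -10 to 20

def prior(theta):
    # Prior belief about theta: normal with mean 5 and standard deviation 3.
    return normal_pdf(theta, 5.0, 3.0)

def likelihood(theta):
    # Likelihood of theta given the sample mean; with known sigma, the sample
    # mean has standard deviation sigma / sqrt(n). Constant factors cancel
    # when the posterior is normalized.
    return normal_pdf(ybar, theta, sigma / math.sqrt(n))

posterior = grid_posterior(thetas, prior, likelihood)
posterior_mean = sum(t * p for t, p in zip(thetas, posterior))
print(round(posterior_mean, 2))
```

The normalization step is what turns the proportionality into an equality: the grid weights play the role of the constant that is omitted from the relation above.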

Thus, the Bayesian approach abandons the frequentist notion of probability altogether by incorporating notions of belief (specifically via the prior). Although this can be a bit alarming to many scientists, it is actually more in line with common usage of the term ‘probability.’ Moreover, instead of a p-value (and, at least superficially, a nicely packaged decision to reject or fail to reject an arbitrary null hypothesis), the result of a Bayesian analysis is a posterior probability distribution. This can roughly be translated as the probability of different values of the parameter of interest, given the observed data. This language may be better suited to communication with decision makers in a policy context because it is more closely aligned with common usage, and it more directly addresses the actual question of interest [6]. In addition, because the analysis includes the prior, it contains a mechanism to incorporate existing knowledge. In fact, one potential benefit of the Bayesian approach is the manner in which knowledge about a tangible biological phenomenon (e.g., an effect size) can be continually updated as new information or observations become available. As a result, some have argued that it is ideally suited to environmental sciences, especially where adaptive management is being considered.
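The idea of continual updating can be sketched in a few lines of Python: the posterior from one batch of data becomes the prior for the next. The sketch below assumes a normal prior on a mean and a known population standard deviation, for which the update has a closed form. The function name `update_normal` and both batches of data are hypothetical and are used only for illustration.

```python
import math

def update_normal(prior_mean, prior_sd, ybar, n, sigma):
    """Update a normal prior on a mean, given n observations with sample mean
    ybar from a population with known standard deviation sigma."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / sigma ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean + data_precision * ybar) / post_precision
    return post_mean, math.sqrt(1.0 / post_precision)

sigma = 3.0                       # assumed known population standard deviation
n1, ybar1 = 10, 6.6               # hypothetical first batch
n2, ybar2 = 5, 7.2                # hypothetical second batch

# Sequential updating: the first posterior is the prior for the second batch.
m1, s1 = update_normal(5.0, 3.0, ybar1, n1, sigma)
m2, s2 = update_normal(m1, s1, ybar2, n2, sigma)

# Updating once with all of the data pooled together.
n_all = n1 + n2
ybar_all = (n1 * ybar1 + n2 * ybar2) / n_all
m_all, s_all = update_normal(5.0, 3.0, ybar_all, n_all, sigma)

print(round(m2, 4), round(m_all, 4))
```

Under these assumptions, the two routes give the same posterior, which is the sense in which knowledge can be updated as new observations arrive without reanalyzing everything from scratch.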

As a simple example, consider our hypothesis about $\mu $. In a Bayesian approach, we first characterize our belief about this parameter, based on prior knowledge or data. Let’s assume that we do in fact believe that the original population is normal with $\mu = 5$. We will also continue to assume that $\sigma = 3$. Thus, our prior is $y \sim N(5,3)$, where the second argument is the standard deviation. We then gather a sample of 10 items and find $\bar y = 6.6$. The next step is to calculate the likelihood and use it to derive a posterior distribution that describes our new belief about $\mu $. The prior, likelihood, and posterior for this example are illustrated in Fig. 4.3a. Based on our sample, our updated estimate of $\mu $ is 6.45. We can also determine the limits within which 95% of possible values of $\mu $ may occur, given the data. To avoid confusion with the confidence interval, we call this interval the 95% credible interval. For this example, it is the interval from 4.68 to 8.22.
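For this particular case (a normal prior on $\mu $ and a known $\sigma $), the posterior is itself normal and can be written down directly, so the numbers above can be checked with a short calculation. The Python sketch below is one way to do so. It assumes that the 3 in the prior is a standard deviation and that the credible interval is the central 95% of the posterior; the variable names are illustrative.

```python
import math
from statistics import NormalDist

prior_mean, prior_sd = 5.0, 3.0   # prior belief about mu
sigma, n, ybar = 3.0, 10, 6.6     # known population sd, sample size, sample mean

# Precision is the reciprocal of variance; precisions add when combining
# a normal prior with a normal likelihood.
prior_precision = 1.0 / prior_sd ** 2
data_precision = n / sigma ** 2
post_var = 1.0 / (prior_precision + data_precision)
post_mean = post_var * (prior_precision * prior_mean + data_precision * ybar)

# Central 95% credible interval from the normal posterior.
posterior = NormalDist(post_mean, math.sqrt(post_var))
lower, upper = posterior.inv_cdf(0.025), posterior.inv_cdf(0.975)
print(f"{post_mean:.2f} ({lower:.2f} to {upper:.2f})")
```

Under these assumptions the calculation gives a posterior mean of about 6.45 and an interval of roughly 4.68 to 8.23. The upper limit differs from the value quoted above in the last digit, which may simply reflect rounding or the method used to produce the figure.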

(a) Prior: $y \sim N(5,3)$
(b) Prior: $y \sim N(1,1)$
(c) Prior: $y \sim \text{Uniform}$

Figure 4.3: Illustration of the Bayesian approach. Here, the impact of three different prior distributions on the posterior distribution is shown. In all cases, the prior is shown in blue, the likelihood is in green, and the posterior is in black. The distributions have been re-scaled to facilitate the illustration, and the posterior estimate of $\mu $ is shown at the dashed line. When the prior distribution is uninformative (panel c), the posterior primarily depends on the likelihood.

To illustrate the impact of the prior distribution, Fig. 4.3 includes two other prior distributions. For example, with the prior $y \sim N(1,1)$, the posterior estimate and 95% credible interval are 3.95 (2.6 to 5.29). In some cases, you may not have good information on which to base the prior. In these cases, an uninformative prior, such as a uniform distribution (where all values have equal density), can be used$^{1}$. This is illustrated in Fig. 4.3c, where the posterior estimate and 95% credible interval are 6.6 (4.74 to 8.45). Here, the lack of an informative prior means that the posterior is influenced mainly by the likelihood. Notice how the posterior changes as the prior changes. The fact that the prior (which, by definition, is somewhat subjective) can influence the outcome of a Bayesian analysis has been a major criticism of Bayesian methods. On the other hand, the impact of the prior on the posterior can be evaluated in a quantitative fashion.
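One simple way to quantify the influence of the prior in this example is to note that, for a normal prior and a known $\sigma $, the posterior mean is a weighted average of the prior mean and the sample mean, with weights proportional to their precisions (reciprocal variances). The Python sketch below computes the weight carried by each prior. It is an illustration under those assumptions, not a general method. The helper name `prior_weight` is hypothetical, and the near-flat prior (centered at 0 with a very large standard deviation) is an assumed stand-in for the uniform prior.

```python
def prior_weight(prior_sd, n, sigma):
    """Fraction of the posterior mean that comes from the prior mean."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / sigma ** 2
    return prior_precision / (prior_precision + data_precision)

n, ybar, sigma = 10, 6.6, 3.0

# (prior mean, prior sd); the near-flat entry is an assumed approximation
# to a uniform prior, using a very large standard deviation.
priors = {
    "N(5,3)": (5.0, 3.0),
    "N(1,1)": (1.0, 1.0),
    "near-flat": (0.0, 1e6),
}

results = {}
for label, (prior_mean, prior_sd) in priors.items():
    w = prior_weight(prior_sd, n, sigma)
    post_mean = w * prior_mean + (1.0 - w) * ybar
    results[label] = (w, post_mean)
    print(f"{label}: weight on prior = {w:.3f}, posterior mean = {post_mean:.2f}")
```

The tight $N(1,1)$ prior carries nearly half of the weight, which is why it pulls the posterior estimate so far from the sample mean, whereas the near-flat prior carries essentially none.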

Although Bayes’ Theorem is conceptually simple, calculating the posterior distribution from the prior and likelihood is often difficult. Moreover, this difficulty is compounded when classical distributions such as the normal PDF do not adequately describe the distribution of the parameter of interest. In fact, in most cases, instead of trying to determine the posterior distribution analytically, sampling methods such as Markov chain Monte Carlo (MCMC) simulation are used. This is probably a major reason why we have yet to see more widespread adoption of Bayesian methods. Although these methods are beyond the scope of this text, R is a great choice for pursuing Bayesian calculations. [1] provides a useful introduction that focuses on R, and [5] is a thorough introduction to the use of Bayesian models within the context of more complicated ecological models. There is also a parallel R manual [6].
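To give a flavor of what MCMC involves, the following is a minimal random-walk Metropolis sampler, written in Python for illustration and applied to the example from this section (normal prior with mean 5 and standard deviation 3, $n = 10$, $\bar y = 6.6$, $\sigma = 3$). It is a sketch of one simple sampling scheme, not a description of how any particular package works. The function names, the seed, the number of iterations, the burn-in length, and the proposal step size are all assumptions.

```python
import math
import random

def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)

def log_posterior(mu, ybar=6.6, n=10, sigma=3.0, prior_mean=5.0, prior_sd=3.0):
    """Log prior plus log likelihood (the posterior up to an additive constant)."""
    log_prior = log_normal_pdf(mu, prior_mean, prior_sd)
    log_lik = log_normal_pdf(ybar, mu, sigma / math.sqrt(n))
    return log_prior + log_lik

def metropolis(log_target, start, n_iter, step_sd, rng):
    """Random-walk Metropolis: propose a nearby value and accept it with
    probability min(1, target(proposal) / target(current))."""
    samples = []
    current, current_lp = start, log_target(start)
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step_sd)
        proposal_lp = log_target(proposal)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples

rng = random.Random(42)                      # fixed seed for reproducibility
draws = metropolis(log_posterior, start=5.0, n_iter=50000, step_sd=1.5, rng=rng)
kept = sorted(draws[5000:])                  # discard an assumed burn-in period

post_mean = sum(kept) / len(kept)
lower = kept[int(0.025 * len(kept))]         # empirical 2.5% quantile
upper = kept[int(0.975 * len(kept))]         # empirical 97.5% quantile
print(f"{post_mean:.2f} ({lower:.2f} to {upper:.2f})")
```

Because the sampler only ever uses ratios of the posterior, the normalizing constant is never needed, which is what makes this kind of approach practical when the posterior cannot be derived analytically. The summaries should land close to the values reported for Fig. 4.3a, with small differences due to simulation error.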

  1. Actually, the uniform distribution introduces some mathematical difficulties that make calculation of the posterior very difficult. Thus, a very flat normal distribution (i.e., very large $\sigma $) usually is used to approximate a uniform distribution.