11.2.4 Odds ratios

To start, let’s look at how the odds of dying depend on sex, ignoring concentration for the time being. Note that we now have to use the data argument to tell R that the variables dead, alive, and sex are in the data frame named d.

     >  m2 = glm(cbind(dead, alive) ~ sex, family = binomial("logit"),
    + data = d)
     >  summary(m2)

    Call:
    glm(formula = cbind(dead, alive) ~ sex, family = binomial("logit"),
     data = d)

    Deviance Residuals:
        Min       1Q   Median       3Q      Max
    -4.7887  -2.9371   0.1015   2.3400   4.9522

    Coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept)  -0.4754     0.1878  -2.532   0.0113 *
    sexm          0.6425     0.2623   2.449   0.0143 *
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 124.88  on 11  degrees of freedom
    Residual deviance: 118.80  on 10  degrees of freedom
    AIC: 152.91

    Number of Fisher Scoring iterations: 4

As in the ANOVA models we have seen before, R uses treatment contrasts to create a dummy variable that codes for sex. Female is the reference level (R picks the first factor level, which is alphabetical by default). Thus, the intercept is $\ln\left(\frac{p}{1-p}\right)$, the log-odds of death, for females (i.e., $\ln(\text{Odds}_{\text{females}})$). The estimate for sexm is the change in the log-odds going from female to male. In fact, it is:

\begin{equation} \beta_{1} = \ln\left(\frac{\text{Odds}_{\text{males}}}{\text{Odds}_{\text{females}}}\right)\end{equation}
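
To see why, code sex as a dummy variable $x$, assumed here to be 0 for females and 1 for males (the coding that treatment contrasts would produce with female as the reference level), and write $\beta_{0}$ for the intercept. The model then gives one log-odds per sex, and subtracting them isolates $\beta_{1}$:

\begin{equation} \begin{aligned} \ln(\text{Odds}_{\text{females}}) &= \beta_{0} \\ \ln(\text{Odds}_{\text{males}}) &= \beta_{0} + \beta_{1} \\ \beta_{1} &= \ln(\text{Odds}_{\text{males}}) - \ln(\text{Odds}_{\text{females}}) = \ln\left(\frac{\text{Odds}_{\text{males}}}{\text{Odds}_{\text{females}}}\right) \end{aligned} \end{equation}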

We can use exp() to undo the log (remember that log() in R is the natural log).

     >  exp(coef(m2)[2])

        sexm
    1.901186

Thus, the odds (not the probability!) of death for males are 1.9 times the odds of death for females. Equivalently, the odds of death for males are 90% greater than the odds of death for females. This number is referred to as the odds ratio (OR). It is

\begin{equation} \text{OR} = e^{\hat\beta_{1}} = \frac{\text{Odds}_{\text{males}}}{\text{Odds}_{\text{females}}}\end{equation}

(See if you can calculate this by hand using the raw data.) Interpreting logistic regression coefficients as odds ratios is an important concept! An OR of 1 would indicate that the odds of death are equal for the two groups being compared; an OR above 1 indicates higher odds in the non-reference group (here, males), and an OR below 1 indicates lower odds.
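
As a quick check on the arithmetic, here is a minimal sketch that rebuilds the odds ratio from the two coefficients printed in the summary output, rather than from the raw data. The estimates are typed in by hand and rounded, so results will differ slightly from exp(coef(m2)) in the later decimals. The names b0, b1, odds_females, and so on are made up for illustration and do not appear elsewhere in the analysis.

```r
# Coefficient estimates copied (rounded) from the summary(m2) output.
b0 <- -0.4754  # (Intercept): log-odds of death for females
b1 <- 0.6425   # sexm: change in log-odds going from female to male

# Exponentiating a log-odds gives the odds.
odds_females <- exp(b0)
odds_males   <- exp(b0 + b1)

# The ratio of the two odds is the odds ratio, and it equals exp(b1).
odds_ratio <- odds_males / odds_females
all.equal(odds_ratio, exp(b1))

# plogis() is the inverse logit: it converts log-odds to probabilities.
p_females <- plogis(b0)
p_males   <- plogis(b0 + b1)

round(c(odds_ratio = odds_ratio,
        p_females  = p_females,
        p_males    = p_males), 3)
```

With these rounded inputs the fitted probabilities of death come out at roughly 0.38 for females and 0.54 for males. Their ratio is only about 1.4, which is why the 1.9 must be read as a ratio of odds and not of probabilities.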