library(faraway)
library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

This was a designed experiment: 125 fruitflies were divided into 5 groups of 25 each, and each group was assigned a different treatment. (It is conventional to refer to ``treatments’’ in DoE, even if it’s not a drug.) In this case the treatments were expsoure to female fruitflies. One group was solitary, another with a single virgin female, another with eight virgin females, another with 1 pregnant female and another wtih 8 pregnant females. Pregnant females will not mate.

The response variable was longevity, measured in days. A continuous covariate thorax was also recorded: it is the length of the thorax in millimeters, and known to be associated with longevity. The goal of the study was to see if sexual activity has a cost to male fruitflies, in terms of longevity.1

First, a look at the data:

head(fruitfly);dim(fruitfly)
##   thorax longevity activity
## 1   0.68        37     many
## 2   0.68        49     many
## 3   0.72        46     many
## 4   0.72        63     many
## 5   0.76        39     many
## 6   0.76        46     many
## [1] 124   3
is.factor(fruitfly$activity)
## [1] TRUE
ggplot(aes(x=thorax, y=longevity), data = fruitfly) + geom_point() + facet_wrap(~ activity)

It is clear that there is a linear relationship between thorax length and longevity, as expected.

Two questions of interest are whether there is a difference in longevity between activity groups, and whether the linear relationship between thorax and longevity is the same in each group. The latter is perhaps not that important in this example, but if it is the same, the resulting model will be simpler.

lmod <- lm(longevity ~ thorax * activity, data = fruitfly)
summary(lmod)
## 
## Call:
## lm(formula = longevity ~ thorax * activity, data = fruitfly)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -25.9509  -6.7296  -0.9103   6.1854  30.3071 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         -50.2420    21.8012  -2.305    0.023 *  
## thorax              136.1268    25.9517   5.245 7.27e-07 ***
## activityone           6.5172    33.8708   0.192    0.848    
## activitylow          -7.7501    33.9690  -0.228    0.820    
## activitymany         -1.1394    32.5298  -0.035    0.972    
## activityhigh        -11.0380    31.2866  -0.353    0.725    
## thorax:activityone   -4.6771    40.6518  -0.115    0.909    
## thorax:activitylow    0.8743    40.4253   0.022    0.983    
## thorax:activitymany   6.5478    39.3600   0.166    0.868    
## thorax:activityhigh -11.1268    38.1200  -0.292    0.771    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.71 on 114 degrees of freedom
## Multiple R-squared:  0.6534, Adjusted R-squared:  0.626 
## F-statistic: 23.88 on 9 and 114 DF,  p-value: < 2.2e-16

(I’ve just done what you’re not supposed to do, shown all the output in its ugliness.) The notation a*b on the right side is shorthand for a + b + a:b, where a:b is the interaction term.

How does lm get the output? Try below, and then look at model.matrix(lmod2) or model.matrix(lmod). (They are long, you might want just the first 10 rows or so.)

With factor variables, an analysis of variance table is helpful, because you can see the amount of variation explained by the factor by an \(F\)-test.

anova(lmod)
## Analysis of Variance Table
## 
## Response: longevity
##                  Df  Sum Sq Mean Sq F value    Pr(>F)    
## thorax            1 15003.3 15003.3 130.733 < 2.2e-16 ***
## activity          4  9634.6  2408.6  20.988 5.503e-13 ***
## thorax:activity   4    24.3     6.1   0.053    0.9947    
## Residuals       114 13083.0   114.8                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
lmod2 <- lm(longevity ~ thorax + activity, data = fruitfly)
anova(lmod2)
## Analysis of Variance Table
## 
## Response: longevity
##            Df  Sum Sq Mean Sq F value    Pr(>F)    
## thorax      1 15003.3 15003.3 135.069 < 2.2e-16 ***
## activity    4  9634.6  2408.6  21.684 1.974e-13 ***
## Residuals 118 13107.3   111.1                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(lm(longevity ~ activity + thorax, data = fruitfly))
## Analysis of Variance Table
## 
## Response: longevity
##            Df Sum Sq Mean Sq F value    Pr(>F)    
## activity    4  12270  3067.4  27.614 3.481e-16 ***
## thorax      1  12368 12368.4 111.348 < 2.2e-16 ***
## Residuals 118  13107   111.1                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(lmod2)
## 
## Call:
## lm(formula = longevity ~ thorax + activity, data = fruitfly)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -26.108  -7.014  -1.101   6.234  30.265 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -48.749     10.850  -4.493 1.65e-05 ***
## thorax        134.341     12.731  10.552  < 2e-16 ***
## activityone     2.637      2.984   0.884   0.3786    
## activitylow    -7.015      2.981  -2.353   0.0203 *  
## activitymany    4.139      3.027   1.367   0.1741    
## activityhigh  -20.004      3.016  -6.632 1.05e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.54 on 118 degrees of freedom
## Multiple R-squared:  0.6527, Adjusted R-squared:  0.638 
## F-statistic: 44.36 on 5 and 118 DF,  p-value: < 2.2e-16

What is the difference between the two anova tables? Note that when we omit the interaction term, activity shows a large difference between levels; in particular those in the high category have a much reduced lifetime, relative to the baseline category. (What is the baseline category?)

The \(p\)-value for this in the table is \(10^{-9}\) which shouldn’t be taken very seriously (why not?). A table of sample means in the five groups is more informative (I found this code by googling):

group_by(fruitfly, activity) %>% summarize(mean = mean(longevity))
## # A tibble: 5 × 2
##   activity  mean
##   <fct>    <dbl>
## 1 isolated  63.6
## 2 one       64.8
## 3 low       56.8
## 4 many      64.5
## 5 high      38.7

To test differences between any two group means we will need to estimate the standard error of the difference; this is not difficult, but a bit tedious. The original paper didn’t do this, just reported a “\(p < 0.001\)” for the high activity group, and “\(p < 0.05\)” for the low activity group.

Often with several groups the textbook recommendation is to do all pairwise comparisons of the means, and I’m sure that there is R code out there somewhere that will quickly do that for you, and offer a choice of adjustments for multiple testing.

For this particular experiment, though, the groups “one” and “many” were described as controls, as was the group “isolated”, so the comparisons of interest are fairly limited. The comparisons of most interest are probably “isolated” against “one” and “many”, as these are slightly different types of controls, and then “low” and “high” against the average of the 3 control groups. The effect is so striking here, though, that this detail may not be needed.


  1. Partridge and Farquhar, 1981, Nature 284, p.580.Available here↩︎