Scab Disease Data

Scab disease is a fungal infection that affects potatoes. The fungus does not grow well in acidic soil, so investigators designed a study to see whether adding sulphur to the soil would reduce scab disease. In a completely randomized design, plots of land were randomly assigned either to a control condition or to one of several amounts of sulphur, spread on the land in either the Spring or the Fall. The amounts of sulphur were 300, 600 or 1200 pounds per acre. The potatoes were harvested at the end of the growing season, and one hundred potatoes were randomly selected from each plot of land. The potatoes were washed, and then a lab assistant estimated the percent of each potato's surface that was infected with scab disease. The response variable is, for each plot of land, the mean percent of potato surface covered with scab disease. The explanatory variable is the amount of sulphur, in hundreds of pounds per acre; for the control condition it is zero.


Control    35.26
Control    30.69
Control    15.56
Control    31.59
Control    15.91
Control    15.77
Control    19.07
Control    17.15
Spring300  19.96
Spring300  30.12
Spring300  11.78
Spring300   5.13
Spring600  20.63
Spring600  22.75
Spring600  18.22
Spring600  11.40
Spring1200 19.84
Spring1200 15.45
Spring1200  8.11
Spring1200 13.61
Fall300     5.52
Fall300    14.80
Fall300     5.08
Fall300    12.59
Fall600    16.14
Fall600    14.54
Fall600    11.10
Fall600    20.23
Fall1200    2.01
Fall1200    8.48
Fall1200    7.43
Fall1200    5.08
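
As a first look at the data listed above, the plot counts and treatment means can be tabulated. The following is a minimal Python sketch using only the standard library; the names SCAB and summarize are illustrative and are not part of the original document.

```python
from statistics import mean

# Values transcribed from the data listing above:
# mean percent of potato surface covered with scab, one value per plot.
SCAB = {
    "Control":    [35.26, 30.69, 15.56, 31.59, 15.91, 15.77, 19.07, 17.15],
    "Spring300":  [19.96, 30.12, 11.78, 5.13],
    "Spring600":  [20.63, 22.75, 18.22, 11.40],
    "Spring1200": [19.84, 15.45, 8.11, 13.61],
    "Fall300":    [5.52, 14.80, 5.08, 12.59],
    "Fall600":    [16.14, 14.54, 11.10, 20.23],
    "Fall1200":   [2.01, 8.48, 7.43, 5.08],
}


def summarize(data):
    """Return {treatment: (number of plots, sample mean)}."""
    return {name: (len(ys), mean(ys)) for name, ys in data.items()}


if __name__ == "__main__":
    for name, (n, ybar) in summarize(SCAB).items():
        print(f"{name:<10}  n={n}  mean={ybar:.4f}")
```

Note that the control condition has eight plots while each sulphur treatment has four, so the design is unbalanced.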


Here are some questions we would like to answer. For each question, start by stating the null hypothesis in terms of μj values.

  1. Is the amount of scab disease affected by experimental treatment?
  2. Is the average amount of scab disease for each treatment different from the average amount of scab disease in the control condition? That's six tests.
  3. Are there any other significant differences between experimental conditions?
  4. Is the average amount of scab disease in the 3 Fall conditions different from the control condition? This is one test.
  5. Is the average amount of scab disease in the 3 Spring conditions different from the control condition? This is one test.
  6. Is the average amount of scab disease in the 3 Spring conditions different from the average amount of scab disease in the 3 Fall conditions? This is one test.
  7. Is the amount of scab disease affected by the amount of sulphur when the sulphur is applied in the Spring? This is one test.
  8. Is the amount of scab disease affected by the amount of sulphur when the sulphur is applied in the Fall? This is one test.
  9. For each amount of sulphur, is the average amount of scab disease different depending on whether the treatment is applied in Fall or the Spring? That's three tests.
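
Each of these questions can be phrased as a null hypothesis about the seven treatment means. For question 1 the null hypothesis is that all seven μj are equal; for question 6 it is that the average of the three Spring means equals the average of the three Fall means. The sketch below computes the corresponding F statistics for a one-way layout with a pooled error term. It is an illustration in standard-library Python, not the author's code; the function names and the coefficient dictionary are assumptions made for the example. It returns test statistics only, which still need to be compared with an F critical value.

```python
from statistics import mean

# Values transcribed from the data listing in this document.
SCAB = {
    "Control":    [35.26, 30.69, 15.56, 31.59, 15.91, 15.77, 19.07, 17.15],
    "Spring300":  [19.96, 30.12, 11.78, 5.13],
    "Spring600":  [20.63, 22.75, 18.22, 11.40],
    "Spring1200": [19.84, 15.45, 8.11, 13.61],
    "Fall300":    [5.52, 14.80, 5.08, 12.59],
    "Fall600":    [16.14, 14.54, 11.10, 20.23],
    "Fall1200":   [2.01, 8.48, 7.43, 5.08],
}


def anova_pieces(data):
    """Return (means, sizes, mean squared error, error df) for a one-way layout."""
    means = {g: mean(ys) for g, ys in data.items()}
    sizes = {g: len(ys) for g, ys in data.items()}
    sse = sum((y - means[g]) ** 2 for g, ys in data.items() for y in ys)
    df_error = sum(sizes.values()) - len(data)
    return means, sizes, sse / df_error, df_error


def overall_f(data):
    """F statistic for H0: all treatment means are equal (question 1)."""
    means, sizes, mse, _ = anova_pieces(data)
    n_total = sum(sizes.values())
    grand = sum(sizes[g] * means[g] for g in data) / n_total
    ss_between = sum(sizes[g] * (means[g] - grand) ** 2 for g in data)
    return (ss_between / (len(data) - 1)) / mse


def contrast_f(data, coef):
    """F statistic (1 numerator df) for H0: sum over j of coef[j] * mu_j = 0."""
    means, sizes, mse, _ = anova_pieces(data)
    estimate = sum(c * means[g] for g, c in coef.items())
    scale = sum(c ** 2 / sizes[g] for g, c in coef.items())
    return estimate ** 2 / (mse * scale)


# Question 6 as a contrast: average Spring mean minus average Fall mean.
SPRING_VS_FALL = {
    "Spring300": 1 / 3, "Spring600": 1 / 3, "Spring1200": 1 / 3,
    "Fall300": -1 / 3, "Fall600": -1 / 3, "Fall1200": -1 / 3,
}

if __name__ == "__main__":
    print("Overall F (6 and 25 df):", round(overall_f(SCAB), 3))
    print("Spring vs Fall F (1 and 25 df):", round(contrast_f(SCAB, SPRING_VS_FALL), 3))
```

The other single-contrast questions can be handled the same way by changing the coefficients, for example 1 for Control and -1/3 for each Fall treatment in question 4.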

Suppose we want to protect all these tests simultaneously at a joint significance level of 0.05.
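
One possible way to get this joint protection, since question 3 leaves the set of comparisons open-ended, is the Scheffé method for follow-up tests; the text above does not say which method is intended, so this choice is an assumption. Under Scheffé, a follow-up test of s linear restrictions is declared significant only if its F statistic exceeds (r/s) times the critical value of the initial test, where r is the numerator degrees of freedom of the initial test (here r = 6). The sketch below encodes that rule; the function names are illustrative, and the critical value must be supplied by the reader from an F table or statistical software.

```python
def scheffe_critical_value(f_crit, r, s=1):
    """Scheffe critical value for a follow-up test of s restrictions.

    f_crit: critical value of the initial F test (r numerator df).
    r:      numerator df of the initial test (number of treatments minus 1).
    s:      number of linear restrictions tested in the follow-up (1 for a
            single contrast, 2 for equality of three means, and so on).
    """
    if not 1 <= s <= r:
        raise ValueError("s must satisfy 1 <= s <= r")
    return f_crit * r / s


def scheffe_rejects(f_stat, f_crit, r, s=1):
    """True if the follow-up F statistic is significant under Scheffe."""
    return f_stat > scheffe_critical_value(f_crit, r, s)


if __name__ == "__main__":
    # Assumption: 2.49 is the rounded tabled 0.05 critical value of F with
    # 6 and 25 degrees of freedom; verify it before relying on it.
    F_CRIT = 2.49
    print("Single contrast must exceed:", scheffe_critical_value(F_CRIT, r=6, s=1))
    print("Test of 2 restrictions must exceed:", scheffe_critical_value(F_CRIT, r=6, s=2))
```

The price of protecting every possible contrast is a much higher bar for each one: a single contrast has to exceed six times the critical value of the overall test.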

 

 

Many other good questions and discussion topics could be based on this example.

  1. Is n=4 per treatment really enough?
  2. Why not use the individual potatoes as experimental units?
  3. Why is the normal assumption pretty reasonable? (CLT, but not iid)
  4. Explanatory variable is numeric. Why not just do a linear regression?
  5. This brings up the possibility of testing departure from linearity.
  6. Is there a case for one-sided tests here?
  7. Is the amount of reduction worth the expense?
  8. Monitoring actual soil acidity, at various depths. Mediating variable, if that's what it's called.
  9. Degree of infestation is likely not controlled, and could be the most powerful influence. Consider blocking.
  10. Other potential response variables, like size and flavour of the potatoes!
  11. Possible interactions with fertilizers and other pesticides. At least these should be held constant.
  12. How would things be different if this were a purely observational study? It would be easier, you know. Just ask the farmers.
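
To make discussion items 4 and 5 concrete, here is a rough sketch of what the simple linear regression alternative would look like: scab percentage regressed on sulphur amount in hundreds of pounds, with the control at zero. It assumes season of application is ignored, which is a simplification made for the example and not something the study design suggests. The names DOSE and least_squares are illustrative.

```python
# Values transcribed from the data listing in this document.
SCAB = {
    "Control":    [35.26, 30.69, 15.56, 31.59, 15.91, 15.77, 19.07, 17.15],
    "Spring300":  [19.96, 30.12, 11.78, 5.13],
    "Spring600":  [20.63, 22.75, 18.22, 11.40],
    "Spring1200": [19.84, 15.45, 8.11, 13.61],
    "Fall300":    [5.52, 14.80, 5.08, 12.59],
    "Fall600":    [16.14, 14.54, 11.10, 20.23],
    "Fall1200":   [2.01, 8.48, 7.43, 5.08],
}

# Assumption for this sketch: dose in hundreds of pounds per acre,
# pooling Spring and Fall applications of the same amount.
DOSE = {
    "Control": 0,
    "Spring300": 3, "Fall300": 3,
    "Spring600": 6, "Fall600": 6,
    "Spring1200": 12, "Fall1200": 12,
}


def least_squares(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# One (dose, response) pair per plot.
xs = [DOSE[g] for g, plots in SCAB.items() for _ in plots]
ys = [y for plots in SCAB.values() for y in plots]
intercept, slope = least_squares(xs, ys)

if __name__ == "__main__":
    print(f"intercept = {intercept:.3f}, slope = {slope:.4f} per hundred pounds")
```

A straight line uses one degree of freedom for the dose effect where the four distinct doses allow three, so the remaining two could in principle be used to test departure from linearity.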

 


This document is based on an example in Cochran and Cox's (1958) classic text Experimental Designs. The original data appear on page 97 of Cochran and Cox's book. The data above are carefully designed to give the same results as the original, without actually using their numbers. The R function used to reconstruct the data appears in a comment statement at the end of this document. View the HTML source to see it.

This document, including the data and the R function, is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. Use any part of it almost any way you like, as long as you share the results freely. See the license for details.