STA305 s14 Computer Assignment Two
Quiz in lecture on Monday Jan. 27th
The Sulphur Data (Version One)
Scab disease is a fungal infection that affects potatoes. The fungus does not grow well in acidic soil, so investigators designed a study to see whether adding sulphur to the soil would reduce scab disease. In a completely randomized design, plots of land were randomly assigned either to a control condition or to one of three levels of sulphur, which was spread on the land in the fall. The amounts of sulphur were 300, 600 or 1200 pounds per acre. The potatoes were harvested at the end of the growing season. One hundred potatoes were randomly selected from each plot of land. The potatoes were washed, and then a lab assistant estimated the percent of each potato's surface that was infected with scab disease. The response variable is, for each plot of land, the mean percent of potato surface covered with scab disease. The explanatory variable is the amount of sulphur in hundreds of pounds per acre; the control is zero. In the data below, the first column is the sulphur level and the second column is the response.
0 35.26
0 30.69
0 15.56
0 31.59
0 15.91
0 15.77
0 19.07
0 17.15
3 5.52
3 14.8
3 5.08
3 12.59
6 16.14
6 14.54
6 11.1
6 20.23
12 2.01
12 8.48
12 7.43
12 5.08
Write a single SAS program to do this assignment. This means you will not use %include. Also, there should be just one procedure output file.
- Use a class statement in proc means to obtain n, mean and standard deviation of the response variable for each treatment condition separately, including the control. I have not shown you how to do this, but a Google search for "proc means class" gave me the answer in less than a minute.
- Use proc reg to fit a regression model with an intercept and indicator dummy variables for experimental treatment. Everything you need for the quiz will be in the default output. We'll do fancier things with these data at a later time.
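The two steps above might be sketched as follows. This is only one possible layout, not the instructor's program: the data set name `scab`, the variable names `sulphur` and `infection`, the dummy variable names `s3`, `s6` and `s12`, and the placeholder title are all assumptions made for illustration. The control group is taken as the reference category, which is also an assumption.

```sas
/* Hypothetical title: replace with your own name and student number */
title 'Your Name, Student Number: STA305 Computer Assignment Two';

data scab;                      /* data set name is an assumption */
    input sulphur infection;    /* sulphur in hundreds of pounds per acre */
    /* Indicator dummy variables; control (sulphur = 0) is the reference */
    s3  = (sulphur = 3);
    s6  = (sulphur = 6);
    s12 = (sulphur = 12);
    datalines;
0 35.26
0 30.69
0 15.56
0 31.59
0 15.91
0 15.77
0 19.07
0 17.15
3 5.52
3 14.8
3 5.08
3 12.59
6 16.14
6 14.54
6 11.1
6 20.23
12 2.01
12 8.48
12 7.43
12 5.08
;
run;

/* n, mean and standard deviation of the response for each treatment */
proc means data=scab n mean std;
    class sulphur;
    var infection;
run;

/* Regression with an intercept and one dummy per sulphur level */
proc reg data=scab;
    model infection = s3 s6 s12;
run;
quit;
```

The comments are there only to explain the sketch. The rules later in this assignment forbid comment statements that help with interpretation, so a program printed for the quiz would need them removed.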
You will bring a hard copy of your log file and procedure output file to the quiz, answer a few questions about the output, and hand in the output with your quiz. Please put your name and student number in the title statement. The following should give you an idea of what to expect on the quiz.
- What is the mean amount of scab infection for plots of land receiving 300 pounds of sulphur per acre? The answer is a number on your printout.
- What proportion of the variation in the response variable is explained by experimental treatment? The answer is a number on your printout.
- Do the four different treatments (including the control) differ in the expected amount of scab disease?
  - Give the null hypothesis in terms of β values.
  - Give the value of the test statistic. The answer is a number from your printout.
  - Give the p-value. The answer is a number from your printout.
  - Do you reject the null hypothesis at α = 0.05? Answer Yes or No.
  - In plain, non-statistical language, what do you conclude? The answer is something about scab disease on potatoes.
- Does adding 300 pounds of sulphur per acre have an effect on expected amount of scab disease?
  - Give the null hypothesis in terms of β values. Make this a two-sided test, but think about whether it should be one-sided.
  - Give the value of the test statistic. The answer is a number from your printout.
  - Give the (two-sided) p-value. The answer is a number from your printout.
  - Do you reject the null hypothesis at α = 0.05? Answer Yes or No.
  - In plain, non-statistical language, what do you conclude? The answer is something about scab disease on potatoes. Even though the test is two-sided, give a directional conclusion if possible.
- Estimate the change in expected amount of scab disease that results from adding 300 pounds of sulphur per acre to the field. The answer is a number. Do you need a calculator?
- Of course, anything you can do for 300 pounds, you should be able to do for the other amounts.
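Several of the questions above are phrased in terms of β values, so it may help to write the model out. The notation here is an assumption, not taken from the assignment: d_1, d_2 and d_3 are indicator dummy variables for 300, 600 and 1200 pounds per acre respectively, and the control is the reference category. Under that coding, the expected response for each treatment would be:

```latex
E(Y \mid d_1, d_2, d_3) = \beta_0 + \beta_1 d_1 + \beta_2 d_2 + \beta_3 d_3

\begin{array}{lcccl}
\text{Treatment}     & d_1 & d_2 & d_3 & E(Y) \\
\hline
\text{Control}       & 0   & 0   & 0   & \beta_0 \\
\text{300 lb/acre}   & 1   & 0   & 0   & \beta_0 + \beta_1 \\
\text{600 lb/acre}   & 0   & 1   & 0   & \beta_0 + \beta_2 \\
\text{1200 lb/acre}  & 0   & 0   & 1   & \beta_0 + \beta_3
\end{array}
```

With a different choice of reference category or dummy coding, the β values would have different meanings, so check the table against your own program before stating a hypothesis.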
Bring both your log file and your procedure output file to the quiz. You will be asked to hand them in with your quiz. Remember, correct interpretation of the wrong numbers is not worth much, and correct numbers without understanding are worth nothing.
Please be reminded of the rules for computer assignments and quizzes. See the syllabus for more detail.
- You may copy freely from me, but do not look at anyone else's SAS code before the quiz.
- Do not write anything on the printouts.
- Important: Comment statements and other typed material that would help with interpretation of the computer output are expressly forbidden. For example, you may not type assignment questions or answers into your code, or otherwise cause them to appear on your log file or procedure output file. Any such material is an unauthorized aid for the purposes of this course, and if you use or possess an unauthorized aid, you will be charged with an academic offence.
- The log and procedure output files must be generated at the same time by the same SAS program or you may lose a lot of marks.
- There must be no errors or notes about invalid data in your log file.
This assignment is based on an example in Cochran and Cox's (1958) classic text Experimental design. The data in this assignment are a reconstructed data set. The original data appear on page 97 of Cochran and Cox's book. The data in this assignment are carefully designed to give the same results as the original, without actually using their numbers. The R function used to reconstruct the data appears in a comment statement at the end of this document. View the HTML source to see it.
This assignment, including the data and the R function, is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. Use any part of it almost any way you like, as long as you share the results freely. See the license for details.