STA429/1007 Assignment 10

Quiz on Friday March 30 at 12:10 p.m.


The file noise.dat comes from a study in which men and women in 3 different age groups are tested on their ability to understand a conversation about politics under 5 different levels of background noise. There are 10 women and 10 men in each age group for a total n = 60. Order of presentation of noise levels was randomized for each subject, and the subjects themselves were tested in random order.

There are 5 lines of data for each case.

  1. In the first column (several characters wide) is identification number, repeated 5 times.
  2. In the second column is rated interest in politics, repeated 5 times. We will not use this variable in the present assignment.
  3. In the 3rd column is sex (1=F, 0=M), repeated 5 times.
  4. In the 4th column is age group (1, 2 or 3), repeated 5 times.
  5. In the 5th column is noise level. Line one always has a 1, line 2 has a 2, etc.
  6. In the 6th column is the time at which the noise level was presented. Values are 1 through 5, and vary from subject to subject. Not all orders were present in the study (n = 60 is less than 5 factorial = 120), but the order is still counterbalanced. We will not use the order variable.
  7. In the 7th column is discrimination score, with higher values indicating better perception/understanding. There are 5 different values, one for each noise level. We will use only the discrimination score at noise level 5, the one on the last line.

We are only going to use three variables in this assignment: Age group, sex and discrimination score at noise level 5. As long as you read these three variables for n=60 cases, you are fine. To make sure we are doing it the same way, my sample mean for discrimination score at noise level 5 is 31.445.

There are two natural ways to read the data. One way is to read all 5*7=35 variables per case, as in tuberead.sas. Another, less tedious way is to read just the data you want, taking advantage of the fixed column format. To use this approach see the input statement of senicread.sas. In the noise data we have "n=5" lines of data per case. Specify #1 to read from line 1, then read sex and age group from the appropriate columns. Still in the input statement, specify #5 to read from line 5, and then read the discrimination score from the appropriate columns. To make sure it worked, I recommend a cross-tabulation of sex by age and a proc means on discrimination score.
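The second approach (take sex and age group from line 1 of each case and the score from line 5) can be sketched outside SAS as a cross-check. This is a minimal Python sketch, not the SAS program the assignment asks for. It assumes the fields are separated by whitespace, which I have not verified against noise.dat, and read_cases is a hypothetical helper name. The demo rows are made up to show the layout and are NOT taken from the data file.

```python
def read_cases(lines, lines_per_case=5):
    """Return one (sex, age, score) tuple per case.

    Assumes whitespace-separated fields in the order:
    id, interest, sex, age group, noise level, time, discrimination score.
    """
    rows = [ln.split() for ln in lines if ln.strip()]
    if len(rows) % lines_per_case != 0:
        raise ValueError("number of data lines is not a multiple of 5")
    cases = []
    for start in range(0, len(rows), lines_per_case):
        first = rows[start]                       # line 1 of the case
        last = rows[start + lines_per_case - 1]   # line 5 of the case
        if int(last[4]) != lines_per_case:
            raise ValueError("expected noise level 5 on the last line")
        sex, age = int(first[2]), int(first[3])
        score = float(last[6])
        cases.append((sex, age, score))
    return cases


# Two made-up cases (hypothetical values), only to show the shape of the file.
demo = """\
101 3 1 2 1 4 40.1
101 3 1 2 2 1 38.0
101 3 1 2 3 5 36.5
101 3 1 2 4 2 33.2
101 3 1 2 5 3 30.0
102 5 0 3 1 2 41.0
102 5 0 3 2 3 39.5
102 5 0 3 3 1 35.0
102 5 0 3 4 5 31.0
102 5 0 3 5 4 28.5
""".splitlines()

cases = read_cases(demo)
print(cases)
```

On the real file you would expect 60 tuples, and the mean of the third element should agree with the sample mean quoted above.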

Make a single categorical independent variable consisting of all the age-sex combinations. This variable takes on 6 values. You will be testing contrasts of the group means.
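One possible way to number the six combinations is shown below as a minimal Python sketch. The numbering (women in age groups 1 to 3 become groups 1 to 3, men become groups 4 to 6) is my own assumption, and group_of is a hypothetical helper name. Any one-to-one coding works as long as you know which group is which when you write the contrasts.

```python
def group_of(sex, age):
    """Map sex (1=F, 0=M) and age group (1, 2, 3) to a group number 1..6."""
    if sex not in (0, 1) or age not in (1, 2, 3):
        raise ValueError("unexpected sex or age code")
    # F (sex=1) -> 1, 2, 3 and M (sex=0) -> 4, 5, 6
    return 3 * (1 - sex) + age


# Readable labels for each group number, e.g. 1 -> "F1", 6 -> "M3".
labels = {
    group_of(s, a): ("F" if s == 1 else "M") + str(a)
    for s in (1, 0)
    for a in (1, 2, 3)
}
print(labels)
```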

The dependent variable is discrimination score at noise level 5. We'll just call it "discrimination score."

Later, we are going to do a set of custom contrasts and convert them to Scheffé tests, so you'll need a table of critical values. Produce one by modifying the proc iml code in kenton.sas. I found it better to put this at the end of my program.
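For reference, the adjusted critical value can be written as follows. The symbols r, s and f_crit are my own labels for this note and may not match the names used in kenton.sas. Here r is the numerator degrees of freedom of the initial one-way test, s is the numerator degrees of freedom of the follow-up test, and f_crit is the usual critical value of the initial test.

```latex
f_{\mathrm{Sch}} = \frac{r}{s}\, f_{\mathrm{crit}}, \qquad
f_{\mathrm{crit}} = F_{1-\alpha}(r,\, n-p), \qquad
r = 6 - 1 = 5, \quad n - p = 60 - 6 = 54 .
```

For a single contrast s = 1, so the usual critical value is multiplied by 5. A follow-up test of two contrasts at once has s = 2.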

First, do an overall one-way F test. Are there significant differences among the 6 group means? Be able to specify the numerical value of the test statistic, the p-value, etc.
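As a cross-check on the SAS output, the F statistic can be computed directly from the group data. This is a minimal Python sketch, not the SAS program the assignment asks for. The name one_way_f is hypothetical, and no p-value is computed because the Python standard library has no F distribution.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    p = len(groups)                       # number of groups
    n = sum(len(g) for g in groups)       # total sample size
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-groups and within-groups sums of squares
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2
                    for g, m in zip(groups, means) for y in g)
    df_between, df_within = p - 1, n - p
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With six groups of 10 subjects each, the degrees of freedom returned should be 5 and 54.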

Follow up with Scheffé tests. First do all pairwise comparisons of group means; you can and should do these with the means statement rather than setting up custom contrasts. Because the sample sizes are all equal, you get tests in a convenient format, not confidence intervals. Which comparisons, if any, are statistically significant? Give a one-sentence description of the results in plain language.

The table below will help in specifying the hypotheses you are to test.

                 Age Group
Sex       1         2         3
  F      μ11       μ12       μ13
  M      μ21       μ22       μ23

Now set up custom tests of contrasts to answer these questions, converting the tests to Scheffé tests by comparing the test statistics to modified critical values. For any significant results, you should be able to state the results in simple, non-statistical language.
  1. Averaging across age groups, are the mean discrimination scores different for male and female subjects? What I mean is test H0: (μ11 + μ12 + μ13)/3 = (μ21 + μ22 + μ23)/3. Of course you can multiply both sides by 3 to simplify.
  2. Averaging across sex, are the mean discrimination scores different for the three age groups? What I mean is test H0: (μ11 + μ21)/2 = (μ12 + μ22)/2 = (μ13 + μ23)/2.
  3. What we have been doing in the first two questions is testing for differences among "marginal means." Now please test all 3 pairwise comparisons of marginal means for age group. You are being asked to conduct three tests. What are your conclusions? Would they be different if you were conducting one-at-a-time tests instead of Scheffé tests? In any significant comparison, be able to say which marginal mean is greater.
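The three questions above translate into contrast coefficients on the six group means. The following is a minimal Python sketch under two assumptions of mine: the groups are ordered F1, F2, F3, M1, M2, M3 (your coding may differ), and the fractions 1/3 and 1/2 have been multiplied away, which does not change the test. The names SEX, AGE, AGE_PAIRS and contrast_f are hypothetical. The helper handles one contrast at a time with equal group sizes. Question 2 is a joint test of two contrasts, which it does not compute.

```python
# Group order assumed: F1, F2, F3, M1, M2, M3
# (that is, mu11, mu12, mu13, mu21, mu22, mu23).

# Question 1: female minus male, summed over age groups (one contrast).
SEX = [[1, 1, 1, -1, -1, -1]]

# Question 2: equality of the three age marginal means (two contrasts).
AGE = [[1, -1, 0, 1, -1, 0],
       [0, 1, -1, 0, 1, -1]]

# Question 3: pairwise comparisons of age marginal means (one contrast each).
AGE_PAIRS = {
    "1 vs 2": [1, -1, 0, 1, -1, 0],
    "1 vs 3": [1, 0, -1, 1, 0, -1],
    "2 vs 3": [0, 1, -1, 0, 1, -1],
}


def contrast_f(means, coeffs, mse, n_per_group):
    """F statistic (1 numerator df) for one contrast with equal group sizes."""
    if sum(coeffs) != 0:
        raise ValueError("contrast coefficients must sum to zero")
    estimate = sum(c * m for c, m in zip(coeffs, means))
    return estimate ** 2 / (mse * sum(c * c for c in coeffs) / n_per_group)
```

Here means would be the six sample group means, mse the mean squared error from the one-way analysis, and n_per_group would be 10. The resulting statistic should match what SAS reports for the same contrast, and it is then compared with the Scheffé critical value rather than the usual one.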

Of course many more tests are possible, but at this point I have my main conclusion.

Please bring your log file and list file to the quiz.