Assignment 4


You will be asked to hand this assignment in at the beginning of class on Tuesday Feb. 10th.

  1. You need to plan sample size for a 2x3x2 factorial experiment. Call the factors A, B and C. Factor A has 2 levels, factor B has 3 levels and factor C has 2 levels. Data will be analyzed with the normal linear model. Using the usual alpha = 0.05 significance level, you want to be able to detect a main effect for B with power 0.80 if two of the marginal means are equal to each other, and the third marginal mean is half a sigma greater than the other two. Remember, testing for "main effects" means testing for equality of marginal means.
    1. Please write down your dummy variable coding scheme so I know what your betas are.
    2. What is your C matrix? Please be careful here; you want C times beta to be a vector of differences between marginal means, not marginal sums.
    3. Do the requested power analysis assuming all the treatment sample sizes are equal. Attach your printout. I got a total sample size of n = 177, which I increased to n = 180 to maintain equal sample sizes.
    4. Now, try experimenting with some different relative treatment sample sizes. Your objective is to try to find the smallest total sample size that will yield a power of 0.80 for this particular effect. No relative sample size should be zero, and in other respects try to have nice relative sample sizes. Attach the printout showing only the best set of relative sample sizes you found.

      As I expected, many sets of relative sample sizes are optimal. That is, they all yield the same power, and it will be the highest possible power for this effect, given any total sample size.

      What on earth do I mean by "nice" relative sample sizes? Well, I started by trying a set of relative sample sizes that are unequal, but still proportional. My final answer was not quite proportional.
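
    The equal-sample-size case can be sketched with the non-central F distribution. This is only a rough illustration, not the lecture code: the function name bpower and its arguments are made up here, and it assumes equal relative sample sizes in all 12 treatments, with the B marginal means written in units of sigma as (0, 0, 0.5). It skips the C matrix entirely, so it does not replace parts 1 and 2.

    ```r
    # Hypothetical helper (not from lecture): power of the F-test for the
    # B main effect in a 2x3x2 design with equal relative sample sizes.
    # mu holds the three B marginal means in units of sigma.
    bpower <- function(N, mu = c(0, 0, 0.5), cells = 12, alpha = 0.05) {
      df1 <- length(mu) - 1            # numerator df: 3 levels of B, minus 1
      df2 <- N - cells                 # error df: total n minus number of cell means
      # Non-centrality parameter: with equal cell sizes, each level of B
      # holds N/3 of the observations.
      ncp <- (N / length(mu)) * sum((mu - mean(mu))^2)
      crit <- qf(1 - alpha, df1, df2)  # critical value under H0
      1 - pf(crit, df1, df2, ncp = ncp)
    }

    # Search upward, one case at a time, for the smallest total N
    # with power of at least 0.80.
    N <- 13
    while (bpower(N) < 0.80) N <- N + 1
    N
    bpower(N)
    ```

    Under these assumptions the non-centrality parameter works out to N/18, so the search is over the total sample size only. Unequal relative sample sizes, as in part 4, need the general approach with the C matrix.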

  2. There are three accepted methods for doing a particular risky surgical procedure. We want to know if they differ in their probability of success. Patients are randomly assigned to each of the three methods, surgery is performed, and we record whether the surgery was successful or not.

    There are several good ways to test the equality of two or more proportions. I want you to use logistic regression with dummy variables, fit a full and a reduced model, and test the difference between them with a large-sample likelihood ratio test. Here are the data. Surgical methods are labelled 1, 2 and 3. Success=1 means the operation was a success.

    1. What dummy variable scheme are you going to use? Denoting the three probabilities of success by pi1, pi2 and pi3, show that your reduced model holds if and only if pi1=pi2=pi3.
    2. Attach the printout. Circle the p-value of the test. Using alpha=0.05, do you conclude that the three procedures differ in probability of success? Write "Yes" or "No" beside the p-value.
    3. Which procedure appears to be the best? I am not asking for a significance test here. Just look at the statistic or statistics of your choice. Write the answer on your printout.

    For this question, I got a p-value of 0.005664708. By the way, applying the anova function to just the full model gives you the right chi-square value. It's labelled "Deviance," because for binary data like these, -2 times the log likelihood is a sum of terms, each of which is the square of a "deviance residual." The whole sum is called the "Deviance" of the model. The difference in deviance between the reduced and full models is exactly -2 times the log of the likelihood ratio, which has an approximate chi-square distribution under H0.

    If you want, you can fit a reduced model and do anova(reducedmodel,fullmodel). I got a reduced model with

    glm(success~1,family=binomial) # Just the intercept
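
    Here is a minimal sketch of the full-versus-reduced comparison, in case the mechanics are unclear. The data below are made up purely for illustration; they are NOT the assignment data, so the p-value will not match mine. The names method, fullmodel, reducedmodel, G and tab are just choices for this sketch.

    ```r
    # Made-up data for illustration only (NOT the assignment data):
    # 20 hypothetical patients per surgical method.
    method  <- factor(rep(1:3, each = 20))
    success <- c(rep(1, 15), rep(0, 5),    # method 1: 15 successes out of 20
                 rep(1, 10), rep(0, 10),   # method 2: 10 successes out of 20
                 rep(1, 6),  rep(0, 14))   # method 3:  6 successes out of 20

    # factor(method) makes glm build the dummy variables itself
    fullmodel    <- glm(success ~ method, family = binomial)
    reducedmodel <- glm(success ~ 1, family = binomial)  # Just the intercept

    # Likelihood ratio statistic: difference in deviance, reduced minus full
    G    <- deviance(reducedmodel) - deviance(fullmodel)
    df   <- df.residual(reducedmodel) - df.residual(fullmodel)
    pval <- 1 - pchisq(G, df)
    c(G = G, df = df, pval = pval)

    # The same test from anova; the "Deviance" column holds G
    tab <- anova(reducedmodel, fullmodel, test = "Chisq")
    print(tab)

    # Sample proportions of success, one per method
    tapply(success, method, mean)
    ```

    The degrees of freedom equal the number of dummy variables dropped in going from the full model to the reduced model, which is 2 with three methods.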

Please feel free to use any of my S code from lecture. Just copy and paste. Here is a copy.