STA429/1007 Assignment 1

Quiz on Thursday Sept. 20th


This assignment is based on Chapter 1 of the online text and the associated lecture material. Do it in preparation for Quiz 1; it is not to be handed in. Read Chapter 1 and think about the concepts. You can skim the details about the elementary tests, but please pay full attention to correlation and simple regression (that is, regression with one independent variable). The plan for this course is to start with simple regression, move up to multiple regression, and build other advanced methods from there. Thus we will bypass the elementary tests, possibly observing them as special cases of what we develop.

Please pay attention to concepts like Independent variable, Dependent variable, Categorical variable, Quantitative variable, Statistical significance, Significance level of a test, p-value, Definition of "unrelated" and "related" variables, Univariate, Multivariate, Independent observations, Repeated measures, Counterbalancing, Experimental versus observational studies, Confounding variables, Placebo effect, Correlation versus causation, Experimenter expectancy, Internal versus external validity.

On the quiz, you could be asked for definitions. You might be asked to make up an original example of a study with certain characteristics (see sample questions below). You could be asked to identify the Independent Variable and the Dependent Variable, and to show how to set up the data file. Here are some sample questions to think about.

  1. Invent and briefly describe (in a few sentences at most) original studies with the following characteristics. Do not use any examples from lecture or the class notes. If the requested example is impossible, say so and explain why it is impossible. The word original is important. If you give an example that is overly similar to one from lecture or the class notes, your answer will receive a zero. If two people give exactly the same example, they will both get a zero for the question.

    1. A categorical independent variable and a continuous dependent variable.
    2. A continuous independent variable and a continuous dependent variable.
    3. A nominal scale independent variable and an ordinal scale dependent variable.
    4. An independent variable that is both quantitative and nominal scale, and a dependent variable that is continuous.
    5. Two categorical independent variables and two categorical dependent variables.
    6. A single categorical independent variable and two quantitative dependent variables.
  2. In a study relating IQ score to birth order, which is the independent variable and which is the dependent variable?
  3. Make up an original example of a multivariate study in which both dependent variables are categorical.
  4. Give an example of a variable for which it would be unreasonable to compute the standard deviation.
  5. If p>.05, the results are significant and we can draw conclusions. True or False?
  6. Is it possible for a variable to be both categorical and quantitative? Give an example.
  7. What is the difference between a statistic and a parameter?
  8. In simple regression, if the slope of the least-squares line equals zero, what is the value of the correlation coefficient r?
  9. Explain how a single outlier could have a huge effect on the least-squares regression line. Draw a picture to illustrate your argument.
  10. A medical researcher conducts a study using twenty-seven litters of cancer-prone mice. Two members are randomly selected from each litter, and all mice are subjected to daily doses of cigarette smoke. For each pair of mice, one is randomly assigned to Drug A and one to Drug B. Time (in weeks) until the first clinical sign of cancer is recorded.
    1. What is the independent variable (or variables)?
    2. What is the dependent variable (or variables)?
    3. Indicate how the data file would be set up.
    4. How could the design be modified to allow comparison of 3 drugs and a placebo? Is this still "repeated measures?"
    5. Presumably the mice are so cancer-prone that they all come down with the disease eventually. But this might not happen, especially if one of the drugs is very effective. Discuss two ways of handling the data from a mouse that died of old age and never showed signs of cancer. Find a problem with both solutions (there will be a problem, unless you know about survival analysis).
    6. In this study, suppose the sample means are exactly identical for the various drug treatments. Is it possible for the population means to be different?
  11. What does it mean for two variables to be related in the population? Your answer must include the word "conditional," or it is wrong.
  12. Is it possible to have a study with repeated measures and a categorical dependent variable? If it is possible, make up an original example. If it is impossible, explain why.
  13. Is it possible for a study to be both experimental and observational? Explain.
  14. It is well known that people who graduate from university have higher lifetime earnings on average than those who do not. Discuss at least one confounding variable that could have produced this result.
  15. Is it possible for Independent Variable and Dependent Variable to be related in the population and unrelated in the sample?
  16. Is it possible for Independent Variable and Dependent Variable to be related in the sample but unrelated in the population?
  17. Is it possible for Independent Variable and Dependent Variable to be related in the population and related in the sample, but not significantly related?
  18. Is it possible for Independent Variable and Dependent Variable to be related in the population and also significantly related in the sample, but in the wrong direction?
  19. Suppose that volunteer patients undergoing elective surgery at a large hospital are randomly assigned to one of three different pain killing drugs, and one week after surgery they rate the amount of pain they have experienced on a scale from zero (no pain) to 100 (extreme pain).
    1. Indicate how the data file would be set up.
    2. What is the independent variable?
    3. What is the dependent variable?
    4. What statistical test would you recommend?
    5. Is this an experimental study, or observational?
    6. Why is it important for the patients to be unaware of which drug they are receiving? Relate this to the idea of a confounding variable.
    7. Is it also important for the physicians to remain unaware of what drugs their patients are getting? Why or why not?
    8. Is it also important for the person administering the questionnaire to remain unaware of what drug each patient is getting? Why or why not?
    9. In this study, suppose the population means were exactly identical for the various drug treatments. Would it be possible for Independent Variable and Dependent Variable to still be related in the population? Explain.
    10. What "population" are we talking about here, anyway?
  20. In a study relating physical attractiveness to academic performance, six judges rated attractiveness on a 10-point scale, from photos of 100 randomly chosen first-year students. The data file contained 10 variables: Six attractiveness ratings, sex of student, number of credits completed by the end of first year, cumulative Grade Point Average (GPA) at the end of first year, and a binary variable indicating whether the student was still enrolled at the end of first year. An eleventh variable, mean attractiveness rating, was calculated from the 6 ratings, and was taken to be the definition of "attractiveness."
    1. Is this an experimental study, or observational?
    2. Would it make sense to compute the average correlation of the attractiveness ratings with one another? What would it tell us, if anything? How many numbers would we be averaging?
    3. Which variables are independent variables, and which are dependent?
    4. What statistical test would you recommend for assessing the relationship between attractiveness and GPA?
    5. Give an example of an unmeasured variable that is a potential confounding variable in this study. Explain how it might produce an apparent relationship between attractiveness and GPA even if no real relationship existed.
    6. It is suggested that order of presentation be counterbalanced, so that each student has approximately the same mean order of presentation when averaging over judges. Why is this a good idea?
  21. Mark each of the following statements T for true or F for false; write the letters on the lines. Assume the significance level is alpha = .05 in all cases.
    1. ____ In an experimental study, a statistically significant relationship between the independent variable and the dependent variable can provide some evidence of a causal relationship.
    2. ____ In simple regression, a positive regression coefficient b1 implies that high values of X tend to go with low values of Y and low values of X tend to go with high values of Y.
    3. ____ We observe r = -0.70, p = .009. We conclude that X and Y are unrelated.
    4. ____ An observational study is one in which cases are randomly assigned to the different values of an independent variable.
    5. ____ If p < .05 we say the results are statistically significant at the .05 level.
    6. ____ We seek to predict the dependent variable from the independent variable.
    7. ____ We observe r = 0.50, p = .002. This means that 50% of the variation in the dependent variable is explained by a linear relationship with the independent variable.
    8. ____ When a relationship between the independent variable and the dependent variable is statistically significant, we conclude there is no evidence that the two variables are actually related.
    9. ____ The greater the p-value, the stronger the evidence that the independent and dependent variable are related.
    10. ____ We would like to predict type of automobile owned (North American vs. other) from income, education and sex of owner. This is impossible, because the dependent variable is categorical.
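
When thinking about the simple regression questions above (the link between the least-squares slope and r, and the effect of a single outlier), it may help to experiment with a few numbers. The following is a minimal sketch in Python, not part of the course text: the data are made up, and the helper name slope_and_r is hypothetical. It computes the least-squares slope b1 and the correlation r by hand, first for a small data set and then for the same data with one extreme point added, so you can print and compare the results yourself.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def slope_and_r(x, y):
    """Return (b1, r): least-squares slope of y on x, and the correlation.

    Hypothetical helper for illustration only.
    """
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # sum of cross-products
    sxx = sum((a - mx) ** 2 for a in x)                   # sum of squares for x
    syy = sum((b - my) ** 2 for b in y)                   # sum of squares for y
    b1 = sxy / sxx
    r = sxy / sqrt(sxx * syy)
    return b1, r

# Made-up data (an assumption, not from the text).
x = [1, 2, 3, 4, 5]
y = [2, 1, 3, 1, 2]
b1_small, r_small = slope_and_r(x, y)

# The same data with one extreme observation added.
x_out = x + [20]
y_out = y + [30]
b1_out, r_out = slope_and_r(x_out, y_out)

print("original data:      b1 =", b1_small, " r =", r_small)
print("with extreme point: b1 =", b1_out, " r =", r_out)
```

Both b1 and r share the numerator sxy, which is worth keeping in mind when you compare the two printed lines. Try moving the extra point around and watch how the fitted slope responds.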