Assignment Six: Quiz on Friday March 2nd

          

The quiz will be based on Chapter 5 and associated lecture material.


This assignment uses the Furnace data of Assignment Three. The main purpose of this study is to see which kind of vent damper uses more energy, thermally activated or electrically activated. The vent damper can't do anything when it's inactive (out), but energy consumption with the vent damper active is still strongly related to energy consumption with the vent damper inactive. That's because a huge number of important unmeasured variables (insulation, exposure to the wind, total surface area, efficiency of the furnace, how much heat the residents like, etc.) are identical for the two measurements, because it's the same house. So, let's take energy consumption with vent damper in (active) as the dependent variable. We'll use energy consumption with vent damper out (inactive) as a covariate, and see what else matters.

Note that in this assignment, there are no interactions. We'll get to that later.

  1. First, use proc reg to fit a regression model in which energy consumption with vent damper out is the only independent variable. What proportion of the variation does it explain?
  2. Make indicator dummy variables for the following categorical independent variables. You should check to make sure you did it right, but you need not print the frequency tables. For each of these variables, use proc reg to test whether it is related to energy consumption with vent damper in, once you control for energy consumption with vent damper out. Be able to give the value of the test statistic, the p-value, and whether the results are statistically significant. Be able to state your conclusions (if any) in plain, non-statistical language. Make sure you cannot be accused of accepting the null hypothesis. Check your work with proc glm. Again, you are considering each of these variables one at a time, controlling for energy consumption with vent damper out, but not yet controlling for each other.
  3. Consider the last item, Type of vent damper controlling for energy consumption with vent damper inactive. Please obtain the least squares means as part of your proc glm output. Be able to reproduce the least squares means (using a calculator) from your proc reg output.
  4. Using proc reg, fit a full model. The independent variables are Energy consumption with vent damper inactive, Chimney area, Chimney height, Age of house, and dummy variables for the categorical independent variables of Question 2. Test each variable controlling for all the others. For the categorical independent variables with more than two categories, this means using the test statement. Be able to give the value of the test statistic, the p-value, and whether the results are statistically significant. Be able to state your conclusions (if any) in plain, non-statistical language. Make sure you cannot be accused of accepting the null hypothesis. Check your work with proc glm.
  5. Starting with the full model, try stepwise selection with the significance level for entry to the model and the significance level for staying in the model both equal to 0.05. You are responsible for understanding all the output except C(p). What model do you arrive at?
  6. Now do the same, except using selection = backward instead of selection = stepwise. Does this suggest a different model?
  7. Based on the results of the backward variable selection, fit a model in which Energy consumption with vent damper active depends on Energy consumption with vent damper inactive, Chimney area, and Type of chimney liner. This is the full model for the following questions. If you start with proc reg simple, you will get simple descriptive statistics that will be useful later. Here's the question: Controlling for Energy consumption with vent damper inactive and Chimney area, is Type of chimney liner related to Energy consumption with vent damper active?
    1. Give the value of the test statistic. The answer is a number from the printout.
    2. What is the p-value? The answer is a number from the printout.
    3. Do you reject the null hypothesis at α=0.05? Answer Yes or No.
    4. Are the results statistically significant at the 0.05 level? Answer Yes or No.
    5. After allowing for Energy consumption with vent damper inactive and Chimney area, what proportion of the remaining variation in Energy consumption with vent damper active is explained by Type of chimney liner? The answer is a number between zero and one. Show a little work.
    6. Using regression output, calculate three least squares means, one for each type of chimney liner.
    7. Based on proc reg output, carry out all pairwise comparisons of means for the three types of chimney liner. Use a Bonferroni correction. Calculate all the Bonferroni-corrected p-values.
    8. In simple, non-technical language, what do you conclude? Include a statement that (allowing for ...) houses with certain types of chimney liner use (more, less) energy.
    9. Check your least squares means and Bonferroni-corrected p-values with proc glm.
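
The calculator steps in Questions 3 and 7 can be double-checked with a short script. What follows is a minimal Python sketch, not part of the assignment and not a substitute for the SAS output. It assumes indicator dummy coding with a reference category, that least squares means are predicted values with the covariates held at their sample means, and that the proportion of remaining variation is obtained from the F statistic as a = sF / (n - p + sF), where s is the number of coefficients being tested, p is the number of regression coefficients in the full model (including the intercept), and n is the sample size. The function names and every number in the example are hypothetical; substitute the values from your own printout.

```python
def proportion_remaining_variation(f_stat, s, n, p):
    """Proportion of remaining variation explained: a = sF / (n - p + sF).

    f_stat: F statistic for the s coefficients being tested
    s:      number of coefficients set to zero under the null hypothesis
    n:      sample size
    p:      number of coefficients in the full model, intercept included
    """
    return s * f_stat / (n - p + s * f_stat)


def bonferroni(p_values):
    """Multiply each raw p-value by the number of tests, capping at 1."""
    k = len(p_values)
    return [min(1.0, k * pv) for pv in p_values]


def ls_means(intercept, covariate_slopes, covariate_means, dummy_coefs):
    """Least squares means under indicator coding.

    The first value is for the reference category (all dummies zero);
    the rest add each dummy coefficient in turn. Covariates are held
    at their sample means.
    """
    base = intercept + sum(b * m for b, m in zip(covariate_slopes, covariate_means))
    return [base] + [base + c for c in dummy_coefs]


# Hypothetical numbers for illustration only; none come from the Furnace data.
print(proportion_remaining_variation(f_stat=4.0, s=2, n=90, p=5))
print(bonferroni([0.01, 0.2, 0.5]))
print(ls_means(1.0, [0.5, 2.0], [10.0, 0.5], [0.3, -0.2]))
```

With three types of chimney liner there are three pairwise comparisons, so each raw p-value is multiplied by 3. Starting with proc reg simple gives the covariate means needed for the least squares means.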

Bring your log and list files to the quiz. Do not write anything on the printouts in advance except your name and student number. You may be asked to hand them in. The log and list files for each data set must be generated by the same SAS program or you may lose a lot of marks. There must be no errors or warnings in your log files. There must be no notes about invalid data. Bring a calculator to the quiz.