STA441s20 Assignment 3
Quiz on Monday Jan. 27 in tutorial
This assignment mostly asks for elementary tests. The ideas were covered in
lecture slide set 1; also see your answer to Question 22 of Assignment 1. Doing
elementary tests with SAS was not specifically covered in lecture, though
there are some examples. See Chapter 2 of the textbook and all the
lecture material to this point. Be resourceful.
I do suggest that you avoid proc ttest. If you need a two-sample t-test,
do the equivalent F-test with proc glm. If you need a matched t-test, use
proc means as illustrated in the text.
Wisconsin Power and Light studied the effectiveness of two devices for
improving the efficiency of gas home-heating systems. The electric
vent damper (EVD) reduces heat loss through the chimney when the
furnace is in the off cycle by closing off the vent. It is controlled
electrically. The thermally activated vent damper (TVD) is the same as
the EVD except it is controlled by the thermal properties of a set of
bimetal fins set in the vent. Ninety test houses were used, 40 with
EVDs and 50 with TVDs. For each house, energy consumption was measured
for a period of several weeks with the vent damper active ("vent damper
in") and for an equal period with the vent damper not active ("vent
damper out"). Here are the variables:
House Identification Number
Type of furnace (1=Forced air 2=Gravity 3=Forced water)
Chimney area
Chimney shape (1=Round 2=Square 3=Rectangular)
Chimney height in feet
Type of Chimney liner (0=Unlined 1=Tile 2=Metal)
Type of house (1=Ranch 2=Two-story 3=Tri-level
4=Bi-level 5=One and a half stories)
House age in yrs (99=99+)
Type of damper (1=EVD 2=TVD)
Energy consumption with damper active (in)
Energy consumption with damper inactive (out)
The raw data are available in furnace.data.txt. There is a lesson here. Never trust what you are told about a data file. This is nothing compared to what you will encounter in practice. When in doubt, use common sense.
Write a SAS program that reads and labels the data, including proc format where appropriate. Create the following new variables:
- Average energy consumption: the mean of consumption with vent damper in and consumption with vent damper out.
- The difference between energy consumption with vent damper in and vent damper out.
- A categorical variable with three values: Ranch, Two-story and Other.
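As a starting point, the data step might look something like the sketch below. It is only an illustration under stated assumptions: the variable names, the format names, and the assumption that the file is space-delimited with columns in the order listed above are all made up for the example. Given the warning about the data file, expect to adjust the input statement (and possibly handle missing-value codes) after looking at the raw data.

```sas
/* Sketch only. Variable names, format names, and the assumption of   */
/* space-delimited columns in the documented order are hypothetical.  */
/* Inspect furnace.data.txt before trusting any of this.              */
title 'Furnace data: read, label and derive variables (sketch)';

proc format;
    value furnfmt   1 = 'Forced air'  2 = 'Gravity'    3 = 'Forced water';
    value shapefmt  1 = 'Round'       2 = 'Square'     3 = 'Rectangular';
    value linerfmt  0 = 'Unlined'     1 = 'Tile'       2 = 'Metal';
    value housefmt  1 = 'Ranch'       2 = 'Two-story'  3 = 'Tri-level'
                    4 = 'Bi-level'    5 = 'One and a half stories';
    value dampfmt   1 = 'EVD'         2 = 'TVD';
    value housecfmt 1 = 'Ranch'       2 = 'Two-story'  3 = 'Other';
run;

data furnace;
    infile 'furnace.data.txt';   /* path is an assumption */
    input id furntype charea chshape chheight chliner house age damper
          enin enout;

    /* Derived variables */
    avgcons = (enin + enout) / 2;   /* average of the two measurements   */
    diff    = enout - enin;         /* out minus in: positive = saving;  */
                                    /* the sign convention is a choice   */

    /* Three-category house type: Ranch, Two-story, Other */
    if house = . then housecat = .;
    else if house = 1 then housecat = 1;
    else if house = 2 then housecat = 2;
    else housecat = 3;

    label id       = 'House identification number'
          furntype = 'Type of furnace'
          charea   = 'Chimney area'
          chshape  = 'Chimney shape'
          chheight = 'Chimney height'
          chliner  = 'Type of chimney liner'
          house    = 'Type of house'
          age      = 'House age in years (99 = 99+)'
          damper   = 'Type of damper'
          enin     = 'Energy consumption, damper active (in)'
          enout    = 'Energy consumption, damper inactive (out)'
          avgcons  = 'Average energy consumption'
          diff     = 'Energy consumption out minus in'
          housecat = 'Type of house (3 categories)';

    format furntype furnfmt.  chshape shapefmt.  chliner linerfmt.
           house housefmt.    damper dampfmt.    housecat housecfmt.;
run;
```

The explicit check for a missing house type keeps a missing value from being silently counted as "Other".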
Then,
- Run proc means to obtain sample sizes, means, medians and standard deviations of the quantitative variables. Search for proc means online to see how to get medians; this is the quickest way to access documentation. Run proc freq to get frequency distributions of the categorical variables (a variable may occur in both sets). Be able to answer basic questions like "What is the median chimney height?" (my answer is 20 ft. -- What?! How many feet in a meter?), or "What percentage of houses have a forced water furnace?" (my answer is 7.78%).
- Ignoring all other variables, test whether energy consumption differs between damper active and damper inactive. Be able to answer questions like the following:
- What is the value of the test statistic? The answer is a single number from the printout.
- What is the p-value? The answer is a number or range of numbers from the printout.
- Do you reject the null hypothesis at the 0.05 level? Answer Yes or No.
- Are the results statistically significant at the 0.05 level? Answer Yes or No.
- In plain, non-technical language, what do you conclude, if anything? Say something about furnaces and vent dampers. Statistical terminology is absolutely not allowed here. Pretend you are writing a quick email to your boss, who failed the only Statistics course he ever took, and is touchy about it. Remember that he may forward the email to other "statistical experts" he knows, so beware of accepting H0 if the test is not significant.
- If you observe the shape of a house's chimney, does that improve your ability to predict what type of vent damper the house has?
- What is the value of the test statistic? The answer is a single number from the printout.
- What is the p-value? The answer is a single number from the printout.
- Do you reject the null hypothesis at the 0.05 level? Answer Yes or No.
- Are the results statistically significant at the 0.05 level? Answer Yes or No.
- In plain, non-technical language, what do you conclude, if anything? Say something about chimney shape and type of vent damper. Statistical terminology is absolutely not allowed here. Pretend you are writing a quick email to your boss …
- Is there a tendency for houses that consume a lot of energy with the vent damper inactive to also consume a lot of energy with the vent damper active?
- Answer the question Yes or No. If the answer is No, add a sentence that protects you against accusations that you are accepting the null hypothesis.
- What is the value of the test statistic? The answer is a single number from the printout.
- What is the p-value? The answer is a single number from the printout.
- Do you reject the null hypothesis at the 0.05 level? Answer Yes or No.
- Are the results statistically significant at the 0.05 level? Answer Yes or No.
- What proportion of the variation in energy consumption with vent damper active is explained by energy consumption with vent damper inactive? Your answer is a single number. You can use a calculator, and you probably will want to bring a calculator to the quiz.
- The equation for predicting energy consumption with vent damper active from energy consumption with vent damper inactive is
Predicted Y = b0 + b1 X.
What are b0 and b1? The answer is a pair of numbers from your printout. (One of the residuals may be an outlier, but don't worry about that for now.)
- For a house that consumed 10 BTU (British Thermal Units) with vent damper out, what is the predicted energy consumption with vent damper in? A calculator may be helpful here.
- Do the two types of vent damper differ in the amount of energy they save? Your response variable should be the difference of two variables in the raw data file.
- Answer the question Yes or No. If the answer is No, add a sentence that protects you against accusations that you are accepting the null hypothesis.
- What is the value of the test statistic? The answer is a single number from the printout.
- What is the p-value? The answer is a single number from the printout.
- Do you reject the null hypothesis at the 0.05 level? Answer Yes or No.
- Are the results statistically significant at the 0.05 level? Answer Yes or No.
- What is the mean saving in energy consumption by using an electric vent damper (EVD), compared to not using it? The answer is a single number from your printout.
- What is the mean saving in energy consumption by using a thermally activated vent damper (TVD), compared to not using it? The answer is a single number from your printout.
- Does average energy consumption (mean of consumption with vent damper active and vent damper inactive) depend on type of chimney liner?
- Answer the question Yes or No. If the answer is No, add a sentence that protects you against accusations that you are accepting the null hypothesis.
- What is the value of the main test statistic? The answer is a single number from the printout.
- What is the p-value? The answer is a single number from the printout.
- Do you reject the null hypothesis at the 0.05 level? Answer Yes or No.
- What proportion of the variation in average energy consumption is explained by type of chimney liner? The answer is a single number from the printout.
- In plain, non-technical language, what do you conclude, if anything? Say something about chimney liners and energy consumption. If the overall test is significant, base your conclusions on Bonferroni pairwise multiple comparisons. Statistical terminology is absolutely not allowed here. Pretend you are writing a quick email to your boss …
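The analyses listed above could be requested along the lines of the following sketch. It shows one possible choice of procedures, not the required one, and it assumes a data set called furnace with hypothetical variable names: enin and enout for consumption with damper in and out, diff for their difference, avgcons for their average, and charea, chheight, age, furntype, chshape, chliner, house, housecat and damper for the remaining variables. None of these names come from the data file.

```sas
/* Sketch only: data set and variable names are hypothetical. */

/* Descriptive statistics, including medians */
proc means data=furnace n mean median std;
    var charea chheight age enin enout avgcons diff;
run;

proc freq data=furnace;
    tables furntype chshape chliner house housecat damper;
run;

/* Matched t-test: one-sample t on the difference, via proc means */
proc means data=furnace n mean stddev t probt;
    var diff;
run;

/* Chimney shape by damper type: chi-squared test of independence */
proc freq data=furnace;
    tables chshape * damper / chisq nocol nopercent;
run;

/* Relationship between consumption out and consumption in */
proc corr data=furnace;
    var enout enin;
run;

proc reg data=furnace;
    model enin = enout;   /* Predicted Y = b0 + b1 X */
run;
quit;

/* Two-sample comparison done as the equivalent F-test */
proc glm data=furnace;
    class damper;
    model diff = damper;
    means damper;
run;
quit;

/* One-way analysis of variance with Bonferroni comparisons */
proc glm data=furnace;
    class chliner;
    model avgcons = chliner;
    means chliner / bon;
run;
quit;
```

With a single explanatory variable, the proportion of variation explained equals the square of the correlation coefficient, so either proc corr plus a calculator or the R-Square value from proc reg gives that answer.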
Bring both your log file and your results file to the quiz. Do not
write anything on the printouts except your name and student number. You may
be asked to hand one or both of them in. These two files must be
generated by the same SAS program or you may lose a lot of marks. There must
be no errors, no warnings and no notes about invalid data in your log file.
Bring a calculator.