STA429/1007 Assignment 6

Quiz on Thursday Nov. 8th at 10:10 a.m.


You will find Chapter 4 and Handouts 3 and 4 to be useful for this assignment.

  1. Using the TV data from last assignment, fit a regression model in which the answer to Question 4 (price willing to pay for cable TV) is the dependent variable, and the independent variables are location (represented by indicator dummy variables with City the reference category), value of home, number of TV sets in household, and Total TV hours watched last week.
    1. Make sure you know what the default output means. For example: "Controlling for Value of home, Number of TV sets in household, and Total TV hours watched last week, is there a difference between the Rural and City locations in average price willing to pay for cable TV?" My test statistic was t = 2.45. For which location is predicted Y greater, and how can you tell from the regression coefficient (that is, from b)?
    2. Perform a custom F-test for comparing the 3 locations controlling for the quantitative variables. What is the value of the test statistic (a single number)? What is the p-value? Is it significant at the 0.05 level?
    3. Perform a custom F-test for comparing Rural to Small Town controlling for the quantitative variables (there's no t-test for this one). What is the value of the test statistic (a single number)? What is the p-value? Is it significant at the 0.05 level?
  2. Now fit the model again, but this time include interaction terms that allow the slope for each quantitative independent variable to depend upon location (controlling for other variables). My answer has six product terms.
    1. Please make a table (3 rows) showing E(Y|X) for each location, so you can see what the regression coefficients mean.
    2. Perform a custom F-test for all the interaction terms at once. The null hypothesis is that for each quantitative independent variable, the three slopes are equal. What is the value of the test statistic (a single number)? What is the p-value? Is it significant at the 0.05 level?
    3. Perform a custom F-test for equality of slopes just for Value of home. Are the three slopes equal? What is the value of the test statistic (a single number)? What is the p-value? Is it significant at the 0.05 level?
    4. Perform a custom F-test for equality of slopes just for Total number of TV hours watched. Are the three slopes equal? What is the value of the test statistic (a single number)? What is the p-value? Is it significant at the 0.05 level?
    5. Perform a custom F-test for equality of slopes just for Number of TV sets. Are the three slopes equal? What is the value of the test statistic (a single number)? What is the p-value? Is it significant at the 0.05 level?
    6. Using proc iml, calculate the estimated slope of the line relating Number of TV sets to Price willing to pay for cable TV:
      1. For Location = Rural
      2. For Location = Small Town (I get -0.24934)
      3. For Location = City
      Note that you can check your answers by running a regression with just the quantitative independent variables, adding the statement by location; to fit the model separately for each location.
    7. Looking at the answer to the last question, you just have to do the following tests. In each case, be able to give the numerical value of the test statistic, the p-value, whether it's significant, and if so what it means. You can't possibly do these without looking at your table. Remember, in the test statement you are really specifying null hypotheses, with the name of the variable standing for the corresponding regression coefficient.
      1. Controlling for value of home and Total number of TV hours watched, is there a significant relationship between Number of TV sets and Price willing to pay just in the Rural location? (I'm asking if one of the slopes is different from zero. My p-value is 0.7504)
      2. Controlling for value of home and Total number of TV hours watched, is there a significant relationship between Number of TV sets and Price willing to pay just in the Small Town location? (I'm asking if another of the slopes is different from zero. My p-value is 0.5481)
      3. Why don't you need a custom test of slope for the City location? The value of my test statistic is 7.25.
      4. Controlling for value of home and Total number of TV hours watched, are the slopes relating Number of TV sets to average Price willing to pay different for Rural and Small Town locations? My p-value is 0.5147.
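
If the logic behind the custom F-tests and the location-specific slopes is hard to picture, the following minimal sketch may help. It is written in Python purely for illustration (the assignment itself is meant to be done in SAS), it uses simulated data rather than the TV data, and it has only one quantitative variable instead of three. Every name in it (solve, ols, f_statistic, design_row, the made-up intercepts and slopes) is an assumption introduced for this example. It shows two things: a location's slope is the reference slope plus that location's product-term coefficient, and the test of equal slopes compares a full model with a reduced model that drops the product terms.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def ols(X, y):
    """Least-squares fit via the normal equations; returns (coefficients, SSE)."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    b = solve(XtX, Xty)
    sse = sum((yi - sum(bj * xj for bj, xj in zip(b, row))) ** 2
              for row, yi in zip(X, y))
    return b, sse

def f_statistic(X_full, X_reduced, y):
    """Full-versus-reduced F statistic with numerator and denominator df."""
    _, sse_full = ols(X_full, y)
    _, sse_reduced = ols(X_reduced, y)
    df_num = len(X_full[0]) - len(X_reduced[0])
    df_den = len(y) - len(X_full[0])
    F = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return F, df_num, df_den

def design_row(loc, x, interactions):
    """Intercept, two dummies (City is the reference), x, optional products."""
    r = 1.0 if loc == "Rural" else 0.0
    s = 1.0 if loc == "SmallTown" else 0.0
    row = [1.0, r, s, x]
    if interactions:
        row += [r * x, s * x]
    return row

# Simulated data: the intercepts and slopes below are made up for illustration.
rng = random.Random(429)
true_intercept = {"City": 10.0, "Rural": 8.0, "SmallTown": 9.0}
true_slope = {"City": 1.0, "Rural": 0.4, "SmallTown": 0.6}
data = []
for loc in ("City", "Rural", "SmallTown"):
    for _ in range(40):
        x = rng.uniform(1.0, 5.0)
        data.append((loc, x, true_intercept[loc] + true_slope[loc] * x + rng.gauss(0.0, 1.0)))

y = [yv for _, _, yv in data]
X_full = [design_row(loc, x, True) for loc, x, _ in data]
X_reduced = [design_row(loc, x, False) for loc, x, _ in data]

# Slope for each location, read off the full-model coefficients.
b, _ = ols(X_full, y)
slopes = {"City": b[3], "Rural": b[3] + b[4], "SmallTown": b[3] + b[5]}

# Check: a separate regression on the Rural cases alone gives the same slope.
rural = [(x, yv) for loc, x, yv in data if loc == "Rural"]
b_rural, _ = ols([[1.0, x] for x, _ in rural], [yv for _, yv in rural])

# Test of equal slopes: H0 says both product-term coefficients are zero.
F, df_num, df_den = f_statistic(X_full, X_reduced, y)
print("slopes by location:", slopes)
print("Rural slope from separate fit:", b_rural[1])
print("F =", F, "with df", df_num, "and", df_den)
```

The p-value would come from the F distribution with the printed degrees of freedom, which this sketch does not compute. In SAS, a test statement that names the product terms should carry out the same full-versus-reduced comparison.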

Please bring your log file and your list file to the quiz.