STA429/1007 Assignment 9

Quiz on Tuesday Nov 30th


This assignment is based upon lecture material on logistic regression with a dependent variable having more than 2 categories. Please bring log and list files to the quiz.

The file heart contains data from a study following middle-aged male employees of the Western Electric Company in the 1950s. The first part of the file gives descriptions of the variables. This part should be stripped off or skipped using the firstobs option on the infile statement.
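As a rough illustration of skipping the descriptive header with firstobs, here is a minimal sketch. The file path, the firstobs value, and the variable names are all placeholders, not taken from the actual heart file; check the file yourself to find the line where the data begin and the real variable layout.

```sas
/* Sketch only: path, firstobs value, and variable names are hypothetical. */
data heart;
    infile 'heart.dat' firstobs=20;  /* hypothetical: data assumed to start on line 20 */
    input id age dbp chol;           /* hypothetical variable list */
run;

/* Look at the first few cases to make sure the data were read correctly. */
proc print data=heart (obs=10);
run;
```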

After reading the data and making sure everything is okay, please create a new variable called "outcome." It will have 3 categories:

Outcome will be your dependent variable. For interpretability, make the probability of being alive 10 years later the denominator in each generalized logit.
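One way this could be arranged, assuming you fit the model with proc catmod (the output titles mentioned later in this assignment are catmod titles): by default catmod forms generalized logits with the last response level as the denominator, so coding "alive" with the highest numeric value of outcome makes it the reference category. The sketch below assumes a hypothetical coding in which alive is 3, and hypothetical variable names age, dbp, and chol.

```sas
/* Sketch only: data set and variable names are hypothetical.
   Assumes outcome is numeric and "alive 10 years later" has the
   largest code (for example 3), so it is the last response level. */
proc catmod data=heart;
    direct age dbp chol;                       /* treat these as quantitative */
    model outcome = age dbp chol / noprofile;  /* generalized logits by default */
run;
quit;
```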

To check, make tables of OUTCOME by FIRST CORONARY HEART DISEASE EVENT and ALIVE 10 YEARS AFTER ENTERING STUDY. Make sure these tables are the first item in your list file.

With outcome as your dependent variable, do a likelihood ratio test for diastolic blood pressure and cholesterol level (considered simultaneously -- one test), controlling for age, number of cigarettes, height, weight, and family history of coronary heart disease. Use proc iml to calculate the test statistic G and the p-value. What are your degrees of freedom? That is, how many β coefficients are zero under the reduced model?
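The proc iml step might look something like the sketch below. G is the difference between the -2 log likelihood of the reduced model and that of the full model, and the p-value is the upper tail of a chi-square distribution with df degrees of freedom. All three numbers shown are placeholders, not results from the heart data; substitute the values from your own output and your own count of degrees of freedom.

```sas
/* Sketch only: the -2 log likelihood values and df are placeholders. */
proc iml;
    neg2ll_reduced = 1000.0;   /* hypothetical -2 log L, reduced model */
    neg2ll_full    = 990.0;    /* hypothetical -2 log L, full model */
    df             = 1;        /* placeholder: replace with your own count */
    G    = neg2ll_reduced - neg2ll_full;   /* likelihood ratio statistic */
    pval = 1 - probchi(G, df);             /* upper-tail chi-square probability */
    print G df pval;
quit;
```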

That's the only likelihood ratio test you have to do.

Finally, be able to interpret all the parameter estimates and Wald chi-square tests (except those for the intercepts) under "Analysis of Maximum Likelihood Estimates" and "Maximum Likelihood Analysis of Variance" for the full model.