STA312f10 Final Exam Information


Time and Location

The final exam will be on Thursday Dec 9th from 8-11 a.m. in the Cafeteria (South Building).

Office Hours:

Review slides (one part was accidentally repeated near the end).

Aids Allowed

A calculator (statistical calculators are allowed) and a formula sheet. The formula sheet will be supplied with the exam; click here for a copy. You will notice that the error caught in class has been corrected.

Make sure that the calculator you bring has natural log and exponential functions. Be careful not to use the log key on your calculator; almost certainly it gives log base 10. In this course, log means natural log, which is probably ln on your calculator.
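
To see how much the choice of key matters, here is a small sketch in Python (not part of the course software, which is R and SAS). The value 20 is arbitrary and chosen only for illustration.

```python
import math

x = 20.0  # arbitrary illustrative value, not taken from the course data

natural_log = math.log(x)    # natural log: what "log" means in this course (ln key)
common_log = math.log10(x)   # log base 10: what the log key usually gives

print(round(natural_log, 4))  # 2.9957
print(round(common_log, 4))   # 1.301

# The exponential function undoes the natural log, not log base 10.
recovered = math.exp(natural_log)
print(round(recovered, 4))
```

The two answers differ by a factor of ln(10), about 2.3, so using the wrong key changes every test statistic and estimate that involves a log.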

Format

It's a three-hour exam. You will write your answers on the examination paper. There are six questions. Most of the questions have more than one part. The questions are not equally difficult, and not equally time-consuming. The questions on assignments and quizzes are a good indication of what to expect. Also see more detailed information below.

Preparing for the exam

I see three main ways to study for the exam: reviewing your answers to homework assignments, reviewing the lecture displays (the overheads), and doing one final set of data analyses.

Reviewing the Homework

For the exam, some homework problems are more important than others. If you know how to do the following, you will be fine.

Final Computer Assignment

Seventy marks out of 100 are based on answering questions about pieces of computer printout from R and SAS. Log-linear models and Poisson regression (if there is any) will be done with R, and logistic regression will be done with SAS. Everything will be based on the heart data used in Assignment 10. The file heartread.sas has been slightly expanded; you should download the most recent version and work from that. Here are some details and suggestions.
  1. Log-linear models: In Assignment 10, you found two useful predictors of heart attacks, and they were both categorical. This gives you a nice 3-dimensional table to analyze with log-linear modeling methods. To get the data into R, I suggest you not struggle with the raw data, which has missing values and could be a real pain with R. Instead, make a three-dimensional table with proc freq, and put it into R manually. Find a good model. For any model you fit, be able to describe the model in bracket notation, and using language like "Television is associated with traffic accidents," and so on. When you test the model, be able to state the null hypothesis in symbols. Once you have the model you like most, be able to say what is going on in plain language, like "Those who watch a lot of television have fewer traffic accidents."

    Look at some 2-dimensional marginal tables. There certainly will be a 2x2 table. A lot of questions can be asked about a 2x2 table. Could you label the output as in Log-linear Part 2?

  2. Logistic Regression with a 2-category outcome: With the heart data, consider models not just for heart attack (attack), but for presence versus absence of coronary heart disease (chd), and whether the person was alive 10 years later (alive). Potential predictor variables will be limited to age, diastol, cholest, bmi, smoker, famhist, and educat.
  3. Logistic Regression where the outcome has more than 2 categories: The variable you will see is named outcome. Make a frequency distribution to see what it is. On the exam, the reference category will be "Alive 10 yrs later." Find a nice model. Again, the predictor variables will be limited to age, diastol, cholest, bmi, smoker, famhist, and educat. Try generating some simple models too, and be sure you can calculate estimates of quantities like β12 and so on. For all tests, be able to state the null hypothesis in symbols, and be able to state conclusions in plain, non-technical language. You will not see ordered categories on the final exam.
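
As one example of the questions a 2x2 marginal table (item 1 above) can raise, here is a minimal sketch in Python rather than R or SAS. The counts are invented for illustration and are NOT from the heart data. It assumes the usual definitions of the odds ratio and of the likelihood ratio statistic G² for independence, so check them against the formula sheet.

```python
import math

# Hypothetical 2x2 counts, invented for illustration (NOT from the heart data).
# Rows: predictor level (yes / no); columns: heart attack (yes / no).
table = [[30, 70],
         [15, 85]]

(a, b), (c, d) = table
n = a + b + c + d

# Odds ratio (cross-product ratio) and its natural log.
odds_ratio = (a * d) / (b * c)
log_odds_ratio = math.log(odds_ratio)

# Expected counts under independence: row total * column total / n.
row_totals = [a + b, c + d]
col_totals = [a + c, b + d]
expected = [[r * k / n for k in col_totals] for r in row_totals]

# Likelihood ratio statistic: G^2 = 2 * sum(observed * ln(observed / expected)),
# with 1 degree of freedom for a 2x2 table. Requires all observed counts > 0.
g_squared = 2 * sum(
    table[i][j] * math.log(table[i][j] / expected[i][j])
    for i in range(2)
    for j in range(2)
)

print(f"odds ratio     = {odds_ratio:.4f}")
print(f"log odds ratio = {log_odds_ratio:.4f}")
print(f"G^2            = {g_squared:.4f} on 1 df")
```

Note that every logarithm here is a natural log, as everywhere else in the course.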

Here are a few more comments.