STA312f23 Final Exam
This information applies only to the regular final exam, not the special deferred exam.
Time and location
The final exam will be on Saturday December 9th from 9am - 12 noon in KN 137. Questions are like the quiz questions and the homework.
Format
The exam has 9 questions, most with parts a, b, c and so on. It is 15 pages including the cover sheet and my R input and output. You will write your answers on the question paper. There will be a separate formula sheet. Keep a copy of the formula sheet handy as you prepare for the exam.
Homework
This course is mostly about the homework. The homework tells you what I want you to be able to do. Lecture material is only useful to the extent that it helps you do the homework. If the text and other readings are helpful too, that is great.
To study for the final, I recommend that you
- Ignore Assignment One, which was review. You will have to do some of this stuff, but it is covered in later assignments.
- Do the sample questions presented in lecture; treat them as problems with solutions.
- Re-do the non-R parts of the homework.
  - For each assignment, locate the corresponding lecture slides. This is mostly indicated on the course website.
  - Look at the lecture slides and the homework problems together. Observe how most of the homework problems are asking you to use some concept or method from the lecture.
  - Re-do the problems, referring to your earlier answers.
  - If you do not understand what a problem means or what it is asking you to do, find out. You are missing something, and it could be on the final exam.
- Using R, do something reasonable with the data sets described in the next section.
R
The R part of the final exam will be worth 24 marks out of 100. You will not write any R code on the exam, and you will not bring your printouts to the exam. You will answer questions based on my analyses, using at least one of the following data sets.
- bfeed in the KMsurv package
- The liver data: In the liver disease data, patients were randomly assigned to one of two drugs, or to a placebo. The data file includes age and sex (1=F). Blood platelet count was recorded for each patient in each time period. The data are available at
http://www.utstat.toronto.edu/brunner/data/legal/liver.data.txt
What should you do? In my opinion, more or less what you did on the R part of the homework. Also see the R lecture displays. There is more than one "right" answer, so beware of being persuaded by your friends. Think for yourself. The important thing is to become familiar with the data sets, try some analyses, and understand the results.
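As one illustration of "try some analyses," here is a minimal sketch for the bfeed data. It is not the analysis that will appear on the exam. It assumes the survival and KMsurv packages are installed, and it uses the variable names given in the bfeed help file (duration, delta, smoke, yschool, poverty). The choice of explanatory variables is arbitrary and is only meant as a starting point.

```r
# A starting-point sketch only; the variables chosen here are arbitrary.
# Assumes the packages are installed:
#   install.packages(c("survival", "KMsurv"))
library(survival)
library(KMsurv)

data(bfeed)   # breast feeding data from the KMsurv package
str(bfeed)    # check variable names and coding; see help(bfeed)

# Kaplan-Meier estimates of time to weaning, by mother's smoking status.
# duration = weeks of breast feeding, delta = 1 if weaning was observed.
km <- survfit(Surv(duration, delta) ~ smoke, data = bfeed)
print(km)     # sample sizes, number of events and median by group
plot(km, lty = 1:2, xlab = "Weeks",
     ylab = "Proportion still breast feeding")
legend("topright", legend = names(km$strata), lty = 1:2)

# Log-rank test comparing the two smoking groups
lr <- survdiff(Surv(duration, delta) ~ smoke, data = bfeed)
print(lr)

# Proportional hazards regression with a few explanatory variables
cox <- coxph(Surv(duration, delta) ~ smoke + yschool + poverty,
             data = bfeed)
summary(cox)

# For the liver data, something like the following may work.
# ASSUMPTION: the file has a header row; look at the file first.
# liver <- read.table(
#   "http://www.utstat.toronto.edu/brunner/data/legal/liver.data.txt",
#   header = TRUE)
# str(liver)
```

Since you will not write R code on the exam, the useful exercise is reading the output: for each piece, decide what is being tested, which direction the result goes, and how you would state the conclusion in plain language.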
Material on model diagnostics (lecture units 26 and 27) will appear only in the R part of the final exam, if it appears at all. Note that if I did not do something in the R lectures, it is very unlikely that I would do it on the final exam. I am trying to be predictable here; I hope there will be no surprises.
Office hours
- Tuesday Dec. 5th 7-9 pm, online (Marija)
- Thursday Dec. 7th 1-3 (Jerry)
- Friday Dec. 8th 1-3 (Jerry)
More suggestions and comments
- Formula Sheet: Be familiar with what's on it. This can save you a lot of time. The rule is that you can use anything you are not being directly asked to prove.
- Bring a calculator with log and exponential functions. Make sure it's the natural log, not log base 10.
- Stating the null hypothesis: In this course, null hypotheses do not have greater than or less than signs. The significance level is always α = 0.05.
- Past final exams: Avoid them. This is a topics course, and earlier versions of STA312 were on different topics.
- Plain language conclusions: At least one question will ask you to state conclusions in "plain, non-statistical language." Here are some guidelines.
  - Be guided by the 0.05 significance level, but never mention it. If you do, you get a zero even if what you say is correct.
  - Any use of statistical vocabulary such as p-value, null hypothesis, significance etc. will get you a zero. Instead of saying "controlling for," say "allowing for," or "correcting for," or "taking into account." The phrase "controlling for" will not get you a zero, but please avoid it when talking to non-statisticians.
  - If a directional conclusion is possible, make it. Don't say "Survival time was related to sex." Say "Women tended to live longer."
  - If a test is not significant, do not say there was no effect, or no difference. Avoid accepting the null hypothesis, or implying that you accept it. Say "There was no evidence that surgery was related to survival time," or "These results do not provide evidence of a connection between marital status and time required to graduate," or something like that.
  - For any explanatory variable that was not randomly assigned, avoid language that suggests influence, or causal connection. Say "Patients with health club memberships were at less risk for heart attack," not "Exercise prevented heart attacks."
- Emphasis: In terms of point value, the emphasis is probably on the middle part of the course.
This document is licensed under a Creative Commons Attribution-ShareAlike 3.0 (or later) Unported License. The basketball data are protected by the Creative Commons license too.