About the Final Exam

Methods of Applied Statistics Fall 2012 Final Exam Information

Time and Location

The final exam will be on Monday, December 10th, from 9 a.m. to 12 noon in Ramsey Wright 117.

Office Hours for the Final

Tuesday Dec. 4th, 11 a.m. to 2 p.m.
Thursday Dec. 6th, 11 a.m. to 2 p.m.
Friday Dec. 7th, 11 a.m. to 12 noon

Miscellaneous

There are separate exams for graduate students and undergraduates. The graduate exam is a bit harder, but not by much.
Quiz solutions will be posted after office hours on Dec. 4th. After that time, there will be no further discussion of how quizzes were marked.
There will be an optional class on Friday Dec. 7th. I will talk about random effects models and computer-intensive methods including permutation/randomization tests. There will be no quiz, and the material will not be on the final exam.

Format

You will write your answers on the question paper. The exam will be closed book and closed notes. You should bring a calculator with a natural log/exponential function. Any kind is acceptable unless it has communications capability. Pencil is okay.

There are 8 questions, occupying 13 pages not including the cover page. Most of the questions have more than one part. The questions are not equally difficult, and not equally time-consuming. The questions on homework assignments and quizzes are a good indication of what to expect.

Some of the questions (worth 21 points out of 100) include R printouts, and you are asked to answer typical questions about them. But this time it's my output rather than yours. The data sets I will use are a small subset of those you have seen in lecture and homework. There will be no SAS.

Coverage

The final exam is cumulative. What you are supposed to be able to do is indicated by the assignments. The text and lecture overheads are intended to help you understand how to answer questions like the ones in the assignments.

Not all parts of the course are equally represented, mostly because the exam would otherwise have been too long. Here are some details.

The formula sheet will be provided with the exam. Any other formulas I think you might need (like the moment-generating function of the 2-parameter Weibull distribution -- just kidding!) will be provided as part of the question.
The linear algebra material on Assignment 1 was just for review, and will not be on the exam.
For any method, know what the model parameters mean! This is the connection between reality (data) and the statistical model. If you can do this, you will be able to handle requests like these:
- State the model
- State the null hypothesis
- State the conclusion in plain language
- Use parameter estimates for prediction
- Give the test statistic that helps answer a concrete question
Any power analysis will use the non-central chi-square or non-central F distribution, not the kind of elaborate calculations involving the standard normal distribution that you saw in the introductory slide set.
You will not be asked to give definitions, but vocabulary is important. Know what the technical terminology means.
But also be able to step away from technical vocabulary. The ability to describe results in plain, non-statistical language is important, and you will be asked to do it.
Early versions of the exam were much too long. The exam is very predictable, but some things you would expect to see do not appear because they were cut out.

More comments and suggestions

I will mark the final examination. You might say that this section is about my personal peculiarities -- just in the way I mark exams, of course. It is helpful for you to know about this, so your exam-taking strategy will not conflict with my exam-marking strategy.

The purpose of learning Statistics is so you can use statistical methods to draw reasonable conclusions from numerical data. Often, the first several parts of a question will ask for technical details, and the last part will ask for a conclusion (often in plain, non-statistical language). If the technical part is missing, it does not matter what you conclude. Similarly, an answer that has most of the technical details right but gets the conclusion wrong (or leaves it off, or states it incompletely) is almost worthless, and will get few marks. On the other hand, if you make minor technical mistakes but draw reasonable conclusions from what you have, you can still get substantial marks.

When I read an answer, my main goal is to verify that you know what's going on. Here are some more details, mostly about what to avoid.

Make sure you answer the question that is asked.
1. If you answer another question instead of the one that's asked, you will lose substantial marks. It is especially risky to just dump memory and answer a similar question from one of the assignments. If I detect this, you will get a zero for the question. Thinking is what's important. Memory without thinking is a crime that you should try to hide if you do commit it.
2. If you answer the question and also write something correct that is not asked, you will not get any extra marks. Your marks will be based on your answer to what is asked.
3. However, if you say something off-topic that is wrong, you can definitely lose marks. To repeat, if you write a perfect answer to the question that is asked, and also write something incorrect, you will lose marks.
Vocabulary is important. A large part of this course is about communication. You must be able to deal with the subject matter using both technical terms and plain language.
Some questions on the final may ask you to state results "in plain, non-statistical language." Please do not ignore the request for plain language. If plain language is requested, you will get zero marks, regardless of what else you say, if you mention the null hypothesis or use any statistical or technical terms like logistic regression, log-linear model, independence, positive relationship, controlling for, and so on. Even the word "significant" (without "statistically") should be avoided; it's borderline.
It is also very important in describing a set of findings to say what happened! For example, do not just say that the average amount of rot in potatoes was related to temperature. Instead, say that there was more rot on average at warmer temperatures.
In a real-world situation (and in the artificial world we presently inhabit, too), you don't get part marks for an answer that (correctly) indicates a relationship is present, but does not say what it is. Imagine you are working in marketing, and you leave a voice mail that says "Consumers recalled one of the commercials better than the other one." Click. Are you trying to frustrate your boss? Are you trying to get fired?
Some professors mark by looking for the correct answer, or part of it. If they find something good, you get points for it. This can encourage a kind of shotgun strategy for writing answers. Just write everything you can think of, and maybe some of it will be what this peculiar individual is looking for.
But that strategy backfires when I mark an exam, because (except for simple numerical answers) I usually do not give marks for things that are correct; I take off marks for things that are wrong or missing. So, if a student writes a long answer that includes the correct conclusion, the wrong conclusion (based on the same information!) and something irrelevant, all I really see is the contradiction between the two conclusions, and I will probably give the answer a zero. Yet it might be that the student understands everything perfectly, but is just writing all the crazy stuff as insurance against the unlikely possibility that maybe that's what I am looking for. Let's make sure that you don't fall into this trap!