Methods of Applied Statistics, Fall 2012
Final Exam Information
Time and Location
The final exam will be on Monday, December 10th, from 9 a.m. to 12 noon in Ramsey Wright 117.
Office Hours for the Final
- Tuesday Dec 4th 11am-2pm
- Thursday Dec 6th 11am-2pm
- Friday Dec 7th 11am-12noon
Miscellaneous
- There are separate exams for graduate students and
undergraduates. The grad student exam is a bit harder, but not much.
- Quiz solutions will be posted after office hours on
Dec. 4th. After that time, there will be no further discussion of
how quizzes were marked.
- There will be an optional class on Friday Dec. 7th. I will
talk about random effects models and computer-intensive methods
including permutation/randomization tests. There will be no quiz,
and the material will not be on the final exam.
Format
You will write your answers on the question paper. The exam will be closed
book and closed notes. You should bring a calculator with a natural
log/exponential function. Any kind is acceptable unless it has
communications capability. Pencil is okay.
There are 8 questions, occupying 13 pages not including the cover
page. Most of the questions have more than one part. The questions are not
equally difficult, and not equally time-consuming. The questions on
homework assignments and quizzes are a good indication of what to expect.
Some of the questions (worth 21 points out of 100) include R printouts,
and you are asked to answer typical questions about them. But this time
it's my output rather than yours. The data sets I will use are a small
subset of those you have seen in lecture and homework.
There will be no SAS.
Coverage
The final exam is cumulative. What you are supposed to be
able to do is indicated by the assignments. The text and lecture overheads
are intended to help you understand how to answer questions like the ones
in the assignments.
Not all parts of the course are equally represented. Mostly this is
because otherwise the exam would have been too long. Here are some details.
- The formula sheet will be provided with the exam. Any other formulas I think you might need (like the moment-generating function of the 2-parameter Weibull distribution -- just kidding!) will be provided as part of the question.
- The linear algebra material on Assignment 1 was just for review,
and will not be on the exam.
- For any method, know what the model parameters mean! This
is the connection between reality (data) and the statistical model. If
you can do this, you will be able to answer questions like these:
  - State the model.
  - State the null hypothesis.
  - State the conclusion in plain language.
  - Use parameter estimates for prediction.
  - Give the test statistic that helps answer a concrete question.
- Any power analysis will use the non-central chi-square or
non-central F, and not the kind of elaborate calculations involving the
standard normal distribution that you saw in the introductory slide
set.
- You will not be asked to give definitions, but vocabulary is important. Know what the technical terminology means.
- But also be able to step away from technical vocabulary. The ability to describe results in plain, non-statistical language is important, and you will be asked to do it.
- Early versions of the exam were much too long. The exam is very predictable, but some things you would expect to see do not appear because they were cut out.
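As a small illustration of using parameter estimates for prediction with nothing more than an exponential function, here is a minimal Python sketch for a logistic regression with one explanatory variable. The estimates and the function name are made up for illustration; they are not taken from any data set in the course or from the exam.

```python
import math

def predicted_probability(b0, b1, x):
    """Estimated P(Y = 1 | x) for a logistic regression with one predictor."""
    log_odds = b0 + b1 * x                      # estimated log odds at x
    return math.exp(log_odds) / (1 + math.exp(log_odds))

# Hypothetical estimates, invented for illustration only
b0_hat, b1_hat = -2.0, 0.5

# exp(b1) is the factor by which the estimated odds are multiplied
# when x increases by one unit.
odds_ratio = math.exp(b1_hat)

# At x = 4 the estimated log odds are -2 + 0.5*4 = 0, so the probability is 0.5.
p_at_4 = predicted_probability(b0_hat, b1_hat, 4)
```

The same arithmetic can be done on any calculator with an exponential key: compute the estimated log odds, exponentiate, and divide by one plus that value.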
I will mark the final examination. You might say that this section is
about my personal peculiarities -- just in the way I mark exams, of
course. It is helpful for you to know about this, so your exam-taking
strategy will not conflict with my exam-marking strategy.
The purpose of learning Statistics is so you can use statistical
methods to draw reasonable conclusions from numerical data. Often, the
first several parts of a question will ask for technical details, and the
last part will ask for a conclusion (often in plain, non-statistical
language). If the technical part is missing, it does not matter what you
conclude. Similarly, an answer that has most of the technical details right
but gets the conclusion wrong (or leaves it off, or states it incompletely)
is almost worthless, and will get few marks. On the other hand, if you make
minor technical mistakes but draw reasonable conclusions from what you
have, you can still get substantial marks.
When I read an answer, my main goal is to verify that you know what's
going on. Here are some more details, mostly about what to avoid.
- Make sure you answer the question that is asked.
  - If you answer another question instead of the one
    that's asked, you will lose substantial marks. It is especially risky to
    just dump memory and answer a similar question from one of the
    assignments. If I detect this, you will get a zero for the
    question. Thinking is what's important. Memory without thinking is a crime
    that you should try to hide if you do commit it.
  - If you answer the question and also write something
    correct that is not asked, you will not get any extra
    marks. Your marks will be based on your answer to what is
    asked.
  - However, if you say something off-topic that is wrong,
    you can definitely lose marks. To repeat, if you write a
    perfect answer to the question that is asked, and also
    write something incorrect, you will lose marks.
- Vocabulary is important. A large part of this course is about
communication. You must be able to deal with the subject matter
using both technical terms and plain language.
- Some questions on the final may ask you to state results "in
plain, non-statistical language." Please do not ignore the request
for plain language. Regardless of what you say, if plain language
is requested then you will get zero marks if you mention the null
hypothesis, or use any statistical or technical terms like logistic
regression, log-linear model, independence, positive relationship,
controlling for, and so on. Even the word "significant" (without
"statistically") should be avoided; it's borderline.
- It is also very important in describing a set of findings to
say what happened! For example, do not just say that
the average amount of rot in potatoes was related to
temperature. Instead, say that there was more rot on average at
warmer temperatures.
In a real-world situation (and in the artificial world we
presently inhabit, too), you don't get part marks for an answer that
(correctly) indicates a relationship is present, but does not say what it
is. Imagine you are working in marketing, and you leave a voice mail that
says "Consumers recalled one of the commercials better than the other one."
Click. Are you trying to frustrate your boss? Are you trying
to get fired?
- Some professors mark by looking for the correct answer, or
part of it. If they find something good, you get points for
it. This can encourage a kind of shotgun strategy for writing
answers. Just write everything you can think of, and maybe some of
it will be what this peculiar individual is looking for.
But that strategy backfires when I mark an exam, because
(except for simple numerical answers) I usually do not give marks for
things that are correct; I take off marks for things that are wrong or
missing. So, if a student writes a long answer that includes the correct
conclusion, the wrong conclusion (based on the same information!) and
something irrelevant, all I really see is the contradiction between the two
conclusions, and I will probably give the answer a zero. Yet it might be
that the student understands everything perfectly, but is just writing all
the crazy stuff as insurance against the unlikely possibility that maybe
that's what I am looking for. Let's make sure that you don't fall
into this trap!