STA441s18 Final Exam
This information applies only to the regular final exam, not the special deferred exam.
Time and location
The final exam will be on Tuesday April 17th in Davis Gym A/B, from 1-4pm. Questions are like the quiz questions, and as you know, the quiz questions are like the homework.
Format
The exam has 9 questions with parts a, b, c, etc., occupying 10 pages including the cover sheet. You will write your answers on the question paper. There will be a separate packet with the formula sheet, my SAS programs and my Results files. Keep a copy of the formula sheet handy as you prepare for the exam. Twenty-seven marks out of 100 are based on my SAS programs and results.
Homework
This course is all about the homework. The homework tells you what I want you to be able to do. Lecture material is only useful to the extent that it helps you do the homework. The text may help too. It is less focused on what we are doing this time, but it is more detailed.
To study for the final, I recommend that you
- Re-do the non-SAS parts of the homework.
- For each assignment, locate the corresponding lecture slides. They are pretty much in chronological order (order of time). If this is a difficult task, you are not familiar enough with the course material.
- Look at the lecture slides and the homework problems together. Observe how most of the homework problems are asking you to use some concept or method from the lecture. Of course sometimes I just want you to think about something, but most questions have a lesson.
- Re-do the problems, referring to your earlier answers.
- If you do not get what a problem means or what it is asking you to do, this means you should find out. You are missing something, and it could be on the final exam.
- Using SAS, do something reasonable with the final data sets described below. What's reasonable? In my opinion, more or less what you did on the SAS part of the homework. However, there is more than one "right answer." The important thing is to become familiar with the data sets, try some analyses, and understand the results. You will not bring your output to the exam. Questions will be based on my output.
Office hours
- Jerry: Wed. April 11th 11-1 in DH-4001.
- Asal: Thursday April 12th from 10-12
- Fri. April 13th 11-1 in DH-4001.
- Mon. April 16th 11-1 in DH-4001.
Quiz solutions (but no SAS code)
If you are in negotiation with Asal about your marks on one of the quizzes, that negotiation may continue. However, now that the answers are posted, there will be no new discussions of the marking. The reason should be obvious.
2016 Final exam
STA441 was taught by someone else in 2014 and I don't know what he did. Ignore old downtown STA442 exams. I teach that course and there is some overlap, but STA442(G) is a joint graduate-undergraduate course and it is more technical than STA441.
Data sets
Exam questions worth 27 points will be based on my SAS output for at least two and at most four of the following data sets. Try some analyses. Look up any terminology that is unfamiliar, or you can ask in office hours (but why wait?). Understand what the variables are, because we will not be answering questions about the data sets during the exam. What I will do with the data is very predictable.
- The Birth Weight Data come from a sample of mothers who recently had a baby. The variables are
- Indicator of birth weight less than 2.5 kg. This is clinically meaningful because babies who weigh less than 2,500 grams tend to have health problems.
- Mother's age in years.
- Mother's weight in pounds at last menstrual period.
- Mother's race (1 = white, 2 = black, 3 = other).
- Smoking status during pregnancy.
- Number of previous premature labours.
- History of hypertension.
- Presence of uterine irritability.
- Number of physician visits during the first trimester.
- Baby's birth weight in grams.
I will use the variable names given in the first line of the data file.
- The Diet Data are from a study of people trying to lose weight. Variables are
- Identification code
- Gender: 0=F, 1=M (which is not the way I do it)
- Age in years
- Height in cm.
- Weight in kg. before starting the diet.
- Diet 1, 2, or 3, randomly assigned
- Weight in kg. after 6 weeks on the diet
These data are in an Excel spreadsheet, and column headers will automatically be the variable names.
- The Program Choice data: Incoming high school students choose their programs of study. Variables are
- Gender: 0=Male, 1=Female
- Socioeconomic status: 1, 2, 3
- Math score
- Reading score
- Science score
- Social studies score
- Writing score
- Program choice: 1=general, 2=academic, 3=vocational
I will use the variable names given in the first line of the data file.
- The CO2 data: The CO2 uptake of six plants from Quebec and six plants from Mississippi was measured at several levels of ambient CO2 concentration. Half the plants of each type were chilled overnight before the experiment was conducted. I will use the variable names given in the first line of the data file.
- The Basketball Data: Right-handed basketball players take right-handed and left-handed hook shots from three spots on the floor (left baseline, right baseline and middle), for a total of 6 shots. Hit or miss is recorded for each shot. I will use the variable names given in the first line of the data file.
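As a starting point for the Diet Data, which are in an Excel spreadsheet, here is a minimal sketch of reading the file with proc import. The folder, file name and data set name are assumptions (none of them are given on this page), so substitute the actual location of your copy.

```sas
/* Sketch only: the path, file name and data set name are assumptions. */
proc import datafile="/folders/myfolders/diet.xlsx"
     out=diet          /* SAS data set to create (hypothetical name) */
     dbms=xlsx
     replace;          /* overwrite the data set if it already exists */
     getnames=yes;     /* take variable names from the column headers */
run;

/* List the variable names that were read from the header row. */
proc contents data=diet;
run;
```

Running proc contents first is a quick way to confirm the variable names before trying any analyses.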
Extras
Plain language conclusions: For example, see basicmath.sas in SAS Example Three.
- tables ethnic * tongue / nocol nopercent chisq;
First language was related to ethnic background.
- Ethnic by Tongue: Asians vs. Eastern Europeans
Asians were less likely than Eastern Europeans to have English as their first language.
- Ethnic by Tongue: Asians vs. Other Europeans
Asians were less likely than Europeans who were not Eastern European to have English as their first language.
- Ethnic by Tongue: Asians vs. Middle East
Students whose ethnic background was judged to be Asian were less likely to have English as their first language than students whose ethnic background was judged to be Middle-Eastern or Pakistani.
- Ethnic by Tongue: Asians vs. East Indian
Asians were less likely than East Indians to have English as their first language.
- Ethnic by Tongue: Asians vs. Other / DK
Asians were less likely than students in the Other/DK category to have English as their first language.
- Eastern European vs. Other European
Eastern Europeans were less likely to have English as their first language than Europeans who were not Eastern European.
- Eastern European vs. Middle Eastern
There was no evidence of a difference between Eastern Europeans and Middle Eastern/Pakistanis in having English as their first language.
- tables (sex ethnic tongue course) * passed / nocol nopercent chisq;
These results are consistent with males and females being equally likely to pass the course.
Differences between ethnic groups in passing the course were small enough to be attributed to chance.
There was no clear evidence of a connection between first language and passing the course.
- Sex and Grade
There was no evidence of a real difference in the final marks of male and female students.
- Ethnic and Grade
Differences in final marks between students from the various ethnic groups were small enough to be attributed to sampling error.
- Mother tongue and Grade
Students whose first language was not English got higher marks on average.
- Course and Grade
Average marks in the three courses were roughly the same.
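The tables statements quoted above are statements within proc freq. A minimal sketch of how they fit together follows, assuming a data set named math (a hypothetical name; basicmath.sas may call it something else).

```sas
/* Sketch only: the data set name "math" is an assumption. */
proc freq data=math;
     /* Cross-tabulation with chi-squared test;
        nocol and nopercent leave only frequencies and row percentages */
     tables ethnic * tongue / nocol nopercent chisq;
     /* One table of each explanatory variable by passed */
     tables (sex ethnic tongue course) * passed / nocol nopercent chisq;
run;
```

Each plain language conclusion above goes with one of the resulting tables and its chi-squared test.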
What to say about SAS (in a job interview).
I had a course where we used SAS University Edition, so that's base SAS running in the SAS Studio environment. We read data from plain text data files using a simple form of the input statement and we used proc import to read from Excel spreadsheets. We used assignment statements and if statements to create new variables, and proc format to label the values. We used arrays and do loops in the data step. We used proc reg and proc glm for univariate and multivariate regression and analysis of variance, and we used proc logistic for regular logistic regression and multinomial logit models. We used proc mixed as well as proc glm to analyze repeated measures data when the dependent variable was assumed normal. At the end we used proc nlmixed to fit mixed logistic models when the outcome was binary and repeated measures. We used ODS to send results to proc iml for further calculations, and we also used ODS select sometimes to limit the output.
If they ask about the put statement, say "Oh, that's like a print statement for writing on the log file, but we didn't use it."
If they ask about SASgraph, say "We didn't use it. My professor said it was scary."
This document is licensed under a Creative Commons Attribution-ShareAlike 3.0 (or later) Unported License. The basketball data are protected by the Creative Commons license too.