STA429/1007 Project

Due Dec. 9th



Regardless of what the course outline may say, every student, graduate or undergraduate, may choose between doing a project and taking the final exam. No one has to do both.

Frankly, my preference is that you do a project based on something in your field rather than taking the final exam, because you may learn better if you are working on data you care about. But of course you may take the final if you wish.


First, locate a data set in your field. If you have a faculty advisor, he or she may be able to help. If this fails, I may be able to help. It is recommended but not required that you discuss the project with me at least briefly before you do a lot of work.

Analyze your data using some (many?) of the statistical methods covered in this course. Unless there is a very good reason for using other software and we have explicitly agreed on this, please use SAS. Come to some conclusions. Write a cover section describing the data, where it came from and what you did with it. Make your conclusions explicit. This section does not have to be polished; I am certainly not asking for a professional paper with an introduction, method, results, discussion and references. But it should be clear and readable by a non-specialist (me). Length is a minimum of one typed page and an absolute maximum of 5.

Attach the log and list files, showing all the analyses you mention in your cover section -- that is, the analyses upon which your main conclusions are based.

Typically, you will do a lot more analyses than you report. If this is the case, please attach another appendix. This one should be just a printout of the program file or files that did the unreported analyses. That's the program file(s), and explicitly not the log or list files. No comment is necessary or desirable, though a title (even hand written) on this last appendix would be appreciated.

I expect that most projects will be analyses of real data sets. But this need not be the case. For example, if you are a big fan of some other statistical software, your project could consist of re-doing a set of data analysis assignments using your software of choice. Let me be more explicit. Using both SAS and the other software, do the following:

  1. Assignment 3: Just the one frequency distribution.
  2. Assignment 4: Just produce the one F-statistic from Question 4. Don't bother with the hand-calculations of proportion of variation, or stating any conclusions.
  3. Assignment 5: Produce the ANOVA summary table for the two-way Time by Drug analysis (both main effects and the interaction), and do the custom tests from Question 8 any way you can (not necessarily with indicator dummy variables and no intercept). Good luck. This is an area where SAS is very strong and flexible compared to many other programs.
  4. Assignment 6: Just test the association between Race of Victim and Death Penalty, controlling for Race of Prisoner. Hand write the sum (that pooled chi-square) on your printout.
  5. Assignment 7: Do the tests for truck and case(truck), and also estimate the variance components any way you can. Don't worry if you cannot do the maximum likelihood method.
  6. Following Assignments: I suppose I am going to have to keep adding to this part for the rest of the term. But if you want to do this final project, start by doing all the stuff above. Do not be concerned if you cannot do it all, but you should be able to do most of it if your favourite stat package is really that great.
Naturally, you will have done all the tasks already with SAS. Just attach portions of your list files to the corresponding outputs from your favourite stat package, circling and labelling the test statistics so I can verify that you are getting the same results.
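
As a rough illustration of the hand-written sum asked for in the Assignment 6 item, here is a minimal Python sketch. It assumes the pooled chi-square is the sum of the Pearson chi-square statistics from the separate 2x2 tables (one per Race of Prisoner), with the degrees of freedom added up as well. The counts, the stratum labels and the function name are made up for illustration; they are not the assignment data, and this is not necessarily how any particular package computes it.

```python
def pearson_chisq_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Shortcut formula for 2x2 tables: n(ad - bc)^2 / (product of the four margins)
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts, NOT the assignment data.
# Rows: race of victim; columns: death penalty yes / no.
# One table for each (hypothetical) level of race of prisoner.
strata = {
    "prisoner group A": [[10, 20], [5, 25]],
    "prisoner group B": [[8, 12], [4, 16]],
}

per_stratum = {name: pearson_chisq_2x2(t) for name, t in strata.items()}
pooled = sum(per_stratum.values())  # the sum you would hand write on the printout
df = len(strata)                    # each 2x2 table contributes 1 degree of freedom

for name, stat in per_stratum.items():
    print(f"{name}: chi-square = {stat:.4f}")
print(f"pooled chi-square = {pooled:.4f} on {df} df")
```

The per-table statistics should match the uncorrected Pearson chi-squares on your printouts; if your software applies a continuity correction by default, the numbers will differ.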

Other projects are possible, even for those without their own data.