STA2101 Project


Overview

Graduate students in STA2101 are required to do a short project in addition to taking the midterm test and final examination. The report on the project is expected to be about 5 typeset (or typed) pages in length, plus appendices showing how you did the work. A PDF of the project is due before midnight on the day of the final exam. Yes, this means you can wait until after you take the final exam to start the project, but it's not really recommended. Please do not email the project to me before the day of the final exam.

Here are some possibilities.

  1. Learn SAS and use it to analyze a data set. Full details are given below.

    SAS (the Statistical Analysis System) is a strong old statistical software package. It is losing ground to R, but it is still widely used in the biomedical research sector and in banks. It is a valuable job credential. You will need to learn it on your own, any way you can. I have some old class materials that may be helpful.

  2. Do the SAS assignment, but using Python. If you want to use software other than SAS or Python, please consult with me first. R is not acceptable.
  3. Learn about a new statistical method that interests you. Your report will have 3 sections. This project may be a bit longer than the others (more pages), but appendices are not necessary.
    1. Description of the method in your own words. Include a clear statement of the model, with typeset formulas. Write for an audience (me) who knows statistics, but is unfamiliar with the topic.
    2. Simulate a data set with R, based on the model in Part 1. Briefly describe the data set in words, and give the R code. Also, please display the first few lines of the data file. The unknown parameters in the model will have numerical values. At the end of this section, list the parameters along with their numerical values.
    3. Using any software of your choice (R is okay), estimate the model parameters. Give a table showing the true parameter values, and the estimates (numbers). If testing and confidence intervals are appropriate, carry out a few tests and produce at least one confidence interval.
  4. Design another project of your choice. Please discuss with me first. I have already had one such discussion that lasted less than 10 seconds. The student said "Text mining with SAS" and I said "Okay, that's great!"
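
For option 3, the simulate-then-estimate logic of parts 2 and 3 can be sketched roughly as follows. This is only an illustration under stated assumptions: it is written in Python rather than the R that part 2 asks for, and it assumes a simple linear regression model with made-up parameter values (beta0 = 1, beta1 = 2, sigma = 3), which would be far too familiar a method for an actual project. The function names simulate and fit_ols are hypothetical, and the confidence interval uses a large-sample normal approximation rather than the exact t quantile.

```python
import random
import statistics


def simulate(n, beta0, beta1, sigma, seed=2101):
    """Draw n (x, y) pairs from y = beta0 + beta1*x + e, with e ~ N(0, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so the data set is reproducible
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


def fit_ols(xs, ys):
    """Least-squares estimates for simple regression, plus the slope's standard error."""
    n = len(xs)
    xbar = statistics.fmean(xs)
    ybar = statistics.fmean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    rss = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))
    s = (rss / (n - 2)) ** 0.5       # estimate of sigma
    se_b1 = s / sxx ** 0.5           # standard error of the slope
    return b0, b1, s, se_b1


# Assumed "true" parameter values, chosen only for illustration.
TRUE = {"beta0": 1.0, "beta1": 2.0, "sigma": 3.0}

xs, ys = simulate(500, **TRUE)
b0, b1, s, se_b1 = fit_ols(xs, ys)
estimates = {"beta0": b0, "beta1": b1, "sigma": s}

# Approximate 95% confidence interval for the slope (normal approximation).
z = statistics.NormalDist().inv_cdf(0.975)
ci = (b1 - z * se_b1, b1 + z * se_b1)

# Table of true values next to their estimates.
print(f"{'param':<8}{'true':>8}{'estimate':>10}")
for name in TRUE:
    print(f"{name:<8}{TRUE[name]:>8.2f}{estimates[name]:>10.3f}")
print(f"approx. 95% CI for beta1: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The point of the exercise is the comparison in the final table: because you chose the true parameter values yourself, you can see directly whether the method recovers them.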


SAS Project

This option is to learn SAS and use it to analyze some data. You may supply your own data set if you wish, but it must be rich and interesting. The default data set is from a longitudinal clinical trial of an interactive, multimedia program known as "Beat the Blues" designed to deliver cognitive behavioural therapy to depressed patients via a computer terminal. Patients with depression recruited in primary care were randomised to either the Beating the Blues program, or to "Treatment as Usual" (TAU). (This isn't my writing; I seem to have lifted it from somewhere and I don't even believe all of it.) The variables are

The data are available in the file BeatTheBlues.data.txt.

Your task is to analyze the data and write a brief report. Here are some guidelines. I may add to this list based on comments and questions.