STA 2101H: Methods of Applied Statistics I
Wednesday September 15 to Wednesday December 8 2021
10am -- 1 pm Eastern
==============================================================
Office Hours
Monday 7 - 8.30 pm; Wednesday 4 - 5.30 pm; on Zoom
Delivery
Lectures 1 and 2, September 15 and 22 are online only. From September 29 we are scheduled to meet in WI 1017 in person. The slides for the lectures will be posted, on good weeks before the scheduled course time, and on rushed weeks just after.
The first hour will usually be mainly lecture-style, with breaks for discussion, on the methods listed in the Syllabus. The second hour will be discussion of case studies, usually from current events, with statistical concepts reviewed as needed in that context. The third hour will be a discussion of computational methods and/or problems, questions about the course material and questions about the homework.
We will use Piazza for discussion, as it is now integrated with Quercus. You will find an entry for Piazza in the course menu. If you click it, you will be asked to sign up.
December 8
- Offline lecture on nonparametric regression
- Slides (updated Dec 7 18.00)
- Slides with scribbles
- Toxoplasmosis html Rmd
- R script for nonparametric regression (updated to include wavelet example)
- NMMAPS paper JRSS A 2006
December 1
- Slides (updated Dec 1 14.00)
- Slides with scribbles
- GLM examples pdf Rmd
- R script for nonparametric regression
November 24
- Slides (Nov 23 6.30 pm)
- Slides (with scribbles)
- See case-control studies handout from November 17
- The soccer example came via 538. The slide deck is here
- HW Week 10 pdf Rmd
November 17
- Slides (updated Nov 17 13.30)
- Slides (with scribbles)
- R Markdown for overdispersion html (posted Nov 16)
- See also updated binary regression from Nov 3
- Handout on case-control studies
- HW Week 9 pdf Rmd
November 3
- Slides (posted Nov 3 8.00)
- Slides with scribbles
- R Markdown for binary regression html (updated Nov 16)
- HW Week 8 pdf Rmd
October 27
- Slides (slightly updated Oct 27)
- Slides (with scribbles)
- R Markdown for factorial designs html
- R script for shuttle data
- R Markdown for heart disease example html
- Science Advances article on citation rates and replicability
- medRxiv paper: large survey of post-COVID symptoms in Germany
- HW Week 7 pdf Rmd
October 20
- Slides (updated Oct 20 9.45)
- Slides (with scribbles)
- Updated Syllabus
- Section 15.5 of 2nd edition of LM
- Web link to WHO Africa's report
- Press Briefing from WHO Africa
- HW Week 6 pdf Rmd
- Article and Appendices for HW 6
October 13
- Slides (slightly updated Oct 13)
- Slides (with scribbles)
- R Markdown for fruitfly example LM-2 14.4; LM-1 13.3
- Output for fruitfly example (html)
- Owen's slides on tie-breaker designs
- Owen & Varian paper on tie-breaker designs
- Economist story on poverty and religiosity. (Might be behind a paywall if you're not on campus.)
- PNAS paper for the Economist story.
- Angrist et al. paper on incentive scheme for undergraduate study.
- HW Week 5 pdf Rmd
October 6
- Slides (posted Oct 5)
- Slides (with scribbles)
- R Markdown and html for the nuclear power plant data. (You won't be able to knit until you remove the code asking to print the image files.)
- HW Week 4 pdf Rmd
September 29
September 22
- HW Week 2 pdf Rmd
- Article for HW 2 and Appendices
- Slides (posted Sep 21)
- Slides (posted Sep 22, without blank spaces) (Slide 26 updated Sep 27)
- Slides (with scribbles)
- R output and R input for comparing models, and looking at factors, with the prostate data
- History section of 2nd edition of Faraway's Linear Models
- Guardian article about ivermectin study, thanks to Liam. From the article:
A medical student in London, Jack Lawrence, was among the first to identify serious concerns about the paper, leading to the retraction. He first became aware of the Elgazzar preprint when it was assigned to him by one of his lecturers for an assignment that formed part of his master's degree. He found the introduction section of the paper appeared to have been almost entirely plagiarised.
September 15
- HW Week 1 pdf Rmd
- Slides (posted Sep 14)
- Slides with scribbles (posted Sep 15 after class)
- R Markdown for prostate example on slides
- Economist modelling page
Before first lecture
Before the first class, I recommend that you
- read the blog post here
- read Chapter 2, Sections 1 - 4 of Faraway's "Linear Models with R"
- check that you can link to my course web page, the Quercus page, and the reference texts (if you don't have a hard copy)
- sign up for Piazza
- download and install R and RStudio
Required references
- Linear Models with R by J. Faraway*
- Extending the Linear Model with R by J. Faraway*
- Principles of Applied Statistics by D.R. Cox and C.A. Donnelly
* The library, and our Quercus page, has the First Editions. If you are buying the texts, I recommend the Second Editions. I will try to give references to both versions as we go along.
Background references
- Statistical Models by A.C. Davison (especially recommended for PhD students)
If Davison is a bit heavy, your undergraduate regression textbook may be helpful, or
- Data Analysis and Graphics using R, by Maindonald and Braun
- An Introduction to Generalized Linear Models by Dobson.
Computing
I will always refer to the R computing package and I highly recommend the RStudio environment. You will need to install both of these on your laptop. I am using Version 4.1.1 of R, and Version 1.4.1717 of RStudio. You can download R from https://cran.r-project.org/ and the free Desktop Version of RStudio from https://rstudio.com/products/rstudio/#rstudio-desktop.
I also strongly recommend using R Markdown to prepare your homework, but you can use LaTeX or Word if you must. For questions involving computing you will need to submit working code. This is easy in R Markdown, but R scripts will also be accepted.
There are many online resources for R and RStudio. If you are new to R, you could look at Quick-R. RStudio has some recommendations on their education page. For more experienced users, the Cheatsheets are invaluable.