## STA 2101H: Methods of Applied Statistics I Fall 2020

### Thursday September 10 to Thursday December 3

12pm -- 3 pm Eastern

==============================================================

### Office Hours

Monday 4.00-5.30pm, 7.00-8.00pm; Wednesday 9-10.30am EDT (BBCollaborate Course Room)

### Comments on Final Homework

### Final Homework

- Questions (pdf)
- Questions (Rmd)
- Paper by Podkul et al. for Q4 (pdf)
- Web page version of Paper by Podkul et al.

### Week 12 December 3

- Slides (updated Dec 3 14.25)
- R script for nuclear plant data
- Excerpt from *Applied Statistics* by Cox and Snell
- Week 12 Recording (Library)

### Week 11 November 26

- Slides (updated Nov 26 15.00)
- R Script illustrating various smoothing methods. (Pretty kludgy)
- R Markdown file illustrating GAM (cribbed from ISLR, Ch. 7)
- html generated by above
- iPad Slides (full of scribbles)
- Week 11 Recording (Library)

### Homework 3 due Dec 3

- Questions (pdf)
- R Markdown
- Paper by Jager and Leek for Q4

### Week 10 November 19

- Lineup protocol example; please complete! Alternatively, you can complete it on a Google form here
- Recording
- Slides (latest version posted Nov 19 15:04)
- R script for non-parametric regression
- Notes on generalized linear models
- Vaccine post on Gelman's blog
- The same blog has a lot about election predictions as well.
- Economist forecast model explained
- Nate Silver weighs in at 538.com

### Week 9 November 5

- Slides
- Annotated Slides
- R code
- Output (html)
- Recording (Library)

### Week 8 October 29

- Slides
- Slides with my markup from class
- Handout on case-control studies
- R code re overdispersion
- R code re Poisson
- Recording (Library)

### Homework 2 due November 5

- Solutions
- R Markdown for solutions
- Questions
- R Markdown for questions
- Perez-Guzman et al Paper and Supplementary Material for Q4.
- Roozenbeek et al Paper on susceptibility to covid misinformation and Supplementary information, for Q5.
- R script downloaded from the authors' repository
- Data from Roozenbeek et al.; downloaded from the authors' repository

### Week 7 October 22

- Slides
- R Script for Example 10.18 (SM)
- Financial Times data visualizations
- NY Times Natural Experiment
- PNAS Widespread use-and misuse-of real-world data
- Recording Part 1 (Library)
- Recording Part 2 (Library)

### October 18

- Syllabus Update 3 to reflect change re HW2

### Week 6 October 15

- Link to *Knowable Magazine* article on election polls
- Note about contrasts in analysis of variance (html) and R Markdown file
- Slides (posted Oct 14; tweaked Oct 15 11.32 am)
- Messer et al 2010 from Cox and Donnelly Ch.5.2
- Recording Part 1 (Library)
- Recording Part 2 (Library)
- Challenger data
- Official Report on Challenger, data on pp. 129-131
- Dalal et al. (1986) JASA paper on data analysis
- Tufte's book *Visual Explanations* has a discussion of PowerPoint and the Challenger; a contrary view is given here
- My R script for analysis

- Violin plots are discussed in Wilke's Fundamentals of Data Viz Ch.9

### Week 5 October 8

- HW 1 solutions thanks to Sangook Kim
- R Markdown for HW 1 solutions, ditto thanks to SK
- Slides
- Recording (Library)
- R Markdown for randomized block example from FLM
- html for randomized block example from FLM
- Updated syllabus 2
- Significance article about grades in UK

### Week 4 October 1

- Slides (typo corrected on slide 10 on Oct 14, thanks Chenghui)
- Prostate R script
- Recording (Library)
- R code for fruitfly data
- Fruitfly html
- Original publication

### September 28

- Updated syllabus
- Slides from Weeks 2 and 3 have been updated to include the 1st edition of Faraway's *Linear Models with* `R`

### Week 3 September 24

- Slides
- Recording (Library)
- Figure 6.9 from the 2nd edition of FLM

### September 22

- Homework 1, due October 1.
- R Markdown document that created it.
- Html version if you prefer.
- Sep 22 slide deck from Public Health Canada

### September 21

- Short note on least squares equations and matrix algebra.

### Week 2 September 17

- Recording of 3rd hour (Library). (First two hours recorded under Blackboard Collaborate)
- Slides (typos corrected after class)
- Wildfire attribution article by Kirchmeier-Young et al introduced in Week 1 (see JMM talk)
- Supplementary material for above
- Moon-shot testing for Covid
- My calculations
- R Markdown file for my calculations

### September 11

Article in the Globe and Mail this morning about the "relatively new field known as event attribution science". You read it here first :) Here are screenshots of the article (best I could do).

### Week 1 September 10

- Slides (edited Sep 10 3.30 pm) (Recording in BBCollaborate)
- JMM talk
- SSC Case Study Competition details
- Piazza information sheet for Quercus
- R Markdown example that fits a polynomial
- Output from one run

### Course Information Sheet **Updated Sep 3**

### Syllabus (Update 4)

### Delivery

The class will be delivered at the scheduled time (Thursdays, 12-3 pm Toronto time) using BBCollaborate. The lectures will be recorded, for viewing offline after the scheduled time. The slides for the lectures will be posted, on good weeks before the scheduled course time, and on rushed weeks just after.

The first hour will usually be mainly lecture-style, with breaks for discussion, on the methods listed in the Syllabus. The second hour will be discussion of case studies, usually from current events, with statistical concepts reviewed as needed in that context. The third hour will be a discussion of computational methods and/or problems, questions about the course material and questions about the homework.

We will use Piazza for discussion, as it is now integrated with Quercus. You will find an entry for Piazza in the course menu. If you click it, you will be asked to sign up. Please see the instructions in the handout, especially the highlighted bits.

### Before first lecture

Before the first class, I recommend that you

- check that you can link to my course web page, the Quercus page, and the reference texts (if you don't have a hard copy)
- sign up for Piazza
- download and install R and RStudio
- look at the slides for this public lecture I gave in January

### Recommended Texts

Statistical Models by A.C. Davison

Principles of Applied Statistics by D.R. Cox and C.A. Donnelly

If Davison seems a little heavy, you may prefer your undergraduate regression textbook, or Linear Models with `R`, and Extending the Linear Model with `R`, both by J.J. Faraway. Electronic copies are on the Quercus page.

Other helpful references are *Data Analysis and Graphics using* `R`, by Maindonald and Braun, and *An Introduction to Generalized Linear Models* by Dobson.

### Computing

I will always refer to the `R` computing package and I highly recommend the RStudio environment. You will need to install both of these on your laptop. I am using Version 3.6.3 of `R`, although Version 4 was released in April 2020. I am using Version 1.3.1073 of RStudio. You can download `R` from https://cran.r-project.org/ and the free Desktop Version of RStudio from https://rstudio.com/products/rstudio/\#rstudio-desktop.
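
Once `R` is installed, a quick check of the version can be sketched as follows. Treating 3.6.3 as a minimum is an assumption made here for illustration, not a stated course requirement, and the variable names are hypothetical.

```r
# Assumed minimum version for illustration only (the 3.6.3 release mentioned above).
min_version <- "3.6.3"

# getRversion() returns the running R version as an object that can be
# compared directly with a version string.
installed <- getRversion()
cat("Installed R version:", as.character(installed), "\n")

# TRUE if the installed version is at least the assumed minimum.
is_recent <- installed >= min_version

if (is_recent) {
  message("R is at least version ", min_version, ".")
} else {
  message("R is older than ", min_version, "; consider updating from CRAN.")
}
```

Running this in the console or in an R Markdown chunk prints the installed version, which is also handy to report when asking computing questions.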

I also strongly recommend using R Markdown to prepare your homework, but you can use LaTeX or Word if you must. For questions involving computing you will need to submit working code. This is easy in R Markdown, but R scripts will also be accepted.

There are many online resources for `R` and RStudio. If you are new to `R`, you could look at Quick-R. RStudio has some recommendations on their education page. For more experienced users, the Cheatsheets are invaluable.