Subjects are presented with tones at a variety of different pitch and volume levels (in a random order). They press a key when they think they hear something. \pause
\item A study can have both within-cases and between-cases factors.
\end{itemize}
\end{frame}

\begin{frame}{You may hear terms like}
\begin{itemize}
\item \textbf{Longitudinal}: The same variables are measured repeatedly over time. Usually lots of variables, including categorical ones, and large samples. If there's an experimental treatment, it's usually once at the beginning, like a surgery. Basically it's \emph{tracking} what happens over time. \pause
\item \textbf{Repeated measures}: Usually, the same subjects experience two or more experimental treatments. Usually categorical explanatory variables and small samples.
\end{itemize}
\end{frame}

\section{Random Effects}

\begin{frame}
\frametitle{General Mixed Linear Model}
{\LARGE
\begin{displaymath}
\mathbf{y}~=~\mathbf{X} \boldsymbol{\beta} ~+~ \mathbf{Zb} ~+~\boldsymbol{\epsilon}
\end{displaymath}
} \pause
\begin{itemize}
\item $\mathbf{X}$ is an $n \times p$ matrix of known constants.
\item $\boldsymbol{\beta}$ is a $p \times 1$ vector of unknown constants.
\item $\mathbf{Z}$ is an $n \times q$ matrix of known constants.
\item $\mathbf{b} \sim N_q(\mathbf{0},\boldsymbol{\Sigma}_b)$ with $\boldsymbol{\Sigma}_b$ unknown but often diagonal.
\item $\boldsymbol{\epsilon} \sim N(\mathbf{0},\sigma^2 \mathbf{I}_n)$, where $\sigma^2 > 0$ is an unknown constant.
\end{itemize}
\end{frame}

\begin{frame}
\frametitle{Random vs. fixed effects}
{\LARGE
\begin{displaymath}
\mathbf{y}~=~\mathbf{X} \boldsymbol{\beta} ~+~ \mathbf{Zb} ~+~\boldsymbol{\epsilon}
\end{displaymath}
}
\begin{itemize}
\item Elements of $\boldsymbol{\beta}$ are called fixed effects.
\item Elements of $\mathbf{b}$ are called random effects.
\item Models with both are called \emph{mixed}.
\end{itemize}
\end{frame}

\begin{frame}
\frametitle{Main application of random effects models}
%\framesubtitle{}
A random factor is one in which the values of the factor are a random sample from a population of values. \pause
\begin{itemize}
\item Randomly select 20 fast food outlets, survey customers in each about quality of the fries. Outlet is a random effects factor with 20 values. \pause Amount of salt would be a fixed effects factor. \pause
\item Randomly select 10 schools, test students at each school. School is a random effects factor with 10 values. \pause
\item Randomly select 15 homeopathic medicines for arthritis (there are quite a few), and then randomly assign arthritis patients to try them. Drug is a random effects factor. \pause
\item Randomly select 15 lakes. In each lake, measure how clear the water is at 20 randomly chosen points. Lake is a random effects factor.
\end{itemize}
\end{frame}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% \section{One Random Factor}

\begin{frame}
\frametitle{One random factor}
\framesubtitle{A nice simple example}
\pause
\begin{itemize}
\item Randomly select 5 farms.
\item Randomly select 10 cows from each farm, milk them, and record the amount of milk from each one.
\item The one random factor is Farm. \pause
\item Total $n=50$
\end{itemize}
\pause
The idea is that ``Farm'' is a kind of random shock that pushes all the amounts of milk in a particular farm up or down by the same amount.
\end{frame}

\begin{frame}
\frametitle{Farm is a random shock}
% \framesubtitle{White Whale Equation 25.38, p. 1047 (almost)}
{\LARGE
\begin{displaymath}
Y_{ij} = \mu + \tau_i + \epsilon_{ij},
\end{displaymath}
} \pause
where
\begin{itemize}
\item[] $\mu$ is an unknown constant parameter.
\item[] $\tau_i \sim N(0,\sigma^2_\tau)$ \pause
\item[] $\epsilon_{ij} \sim N(0,\sigma^2)$
\item[] $\tau_i$ and $\epsilon_{ij}$ are all independent.
\item[] $\sigma^2_\tau \geq 0$ and $\sigma^2 > 0$ are unknown parameters.
\item[] $i=1, \ldots, q$ and $j=1, \ldots, k$ \pause
\item[] There are $q=5$ farms and $k=10$ cows from each farm.
\end{itemize}
\end{frame}

\begin{frame}
\frametitle{General Mixed Linear Model Notation}
\begin{eqnarray*}
Y_{ij} & = &\mu + \tau_i + \epsilon_{ij} \\
\mathbf{Y} & = & \mathbf{X} \boldsymbol{\beta} ~+~ \mathbf{Zb} ~+~\boldsymbol{\epsilon}
\end{eqnarray*}
\begin{displaymath}
\left( \begin{array}{c}
Y_{1,1} \\ Y_{1,2} \\ Y_{1,3} \\ \vdots \\ Y_{5,9} \\ Y_{5,10}
\end{array} \right)
~=~
\left( \begin{array}{c}
1 \\ 1 \\ 1 \\ \vdots \\ 1 \\ 1
\end{array} \right) (\mu)
~+~
\left( \begin{array}{c c c c c}
1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 \\
\end{array} \right)
\left( \begin{array}{c}
\tau_1 \\ \tau_2 \\ \tau_3 \\ \tau_4 \\ \tau_5
\end{array} \right)
~+~
\left( \begin{array}{c}
\epsilon_{1,1} \\ \epsilon_{1,2} \\ \epsilon_{1,3} \\ \vdots \\ \epsilon_{5,9} \\ \epsilon_{5,10}
\end{array} \right)
\end{displaymath}
\end{frame}

\begin{frame}
\frametitle{Distribution of $Y_{ij} = \mu + \tau_i + \epsilon_{ij}$}
\framesubtitle{$i=1, \ldots, 5$ farms and $j=1, \ldots, 10$ cows}
{\Large
\begin{itemize}
\item $Y_{ij} \sim N(\mu,\sigma^2_\tau+\sigma^2)$
\item[]
\item $Cov(Y_{ij},Y_{i,j^\prime}) = \sigma^2_\tau$ for $j \neq j^\prime$
\item[]
\item $Cov(Y_{ij},Y_{i^\prime,j^\prime}) = 0$ for $i \neq i^\prime$
\end{itemize}
} % End size
\end{frame}

\begin{frame}
\frametitle{Classical approach: Skipping lots of details}
%\framesubtitle{$Y_{ij} = \mu + \tau_i + \epsilon_{ij}$}
\begin{itemize}
\item Distribution theory.
\item Components of variance.
\item Testing $H_0: \sigma^2_\tau = 0$.
\item Extension to mixed models.
\item Nested effects.
\item Choice of $F$ statistics based on expected mean squares.
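\end{itemize}
\end{frame}

\begin{frame}
\frametitle{Where the variance and covariance come from}
\framesubtitle{A sketch for $Y_{ij} = \mu + \tau_i + \epsilon_{ij}$, using only the stated independence of the $\tau_i$ and $\epsilon_{ij}$}
For two different cows $j \neq j^\prime$ on the same farm $i$,
\begin{eqnarray*}
Var(Y_{ij}) & = & Var(\tau_i) + Var(\epsilon_{ij}) ~=~ \sigma^2_\tau + \sigma^2 \\
Cov(Y_{ij},Y_{i,j^\prime}) & = & Cov(\tau_i + \epsilon_{ij}, \, \tau_i + \epsilon_{i,j^\prime}) \\
 & = & Var(\tau_i) ~=~ \sigma^2_\tau \\
Corr(Y_{ij},Y_{i,j^\prime}) & = & \frac{\sigma^2_\tau}{\sigma^2_\tau + \sigma^2}
\end{eqnarray*}
\begin{itemize}
\item The constant $\mu$ drops out of every variance and covariance.
\item Cows on different farms share no random term, so their covariance is zero.
\item The last ratio is often called the intraclass correlation; that name is an addition here, not something the earlier slides use.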
\end{itemize}
\end{frame}

\begin{frame}
\frametitle{Repeated measures}
\framesubtitle{Another way to describe \emph{within-cases}}
\pause
\begin{itemize}
\item Sometimes an individual is tested under more than one condition, and contributes a response for each value of a categorical explanatory variable. \pause
\item One can view ``subject'' as just another random effects factor, because subjects supposedly were randomly sampled. \pause
\item Subject would be nested within sex, but might cross stimulus intensity. \pause
\item This is the classical (old fashioned) way to analyze repeated measures.
\end{itemize}
\end{frame}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{A modern approach}

\begin{frame}
\frametitle{Problems with the classical approach}
\begin{itemize}
\item Normality matters in a serious way for the tests of random effects.
\item Sometimes (especially for complicated mixed models) a valid $F$-test for an effect of interest just doesn't exist.
\item When sample sizes are unbalanced, everything falls apart.
\item Hard to incorporate covariates.
\end{itemize}
\end{frame}

\begin{frame}
\frametitle{A modern approach using the general mixed linear model}
\begin{displaymath}
\mathbf{y}~=~\mathbf{X} \boldsymbol{\beta} ~+~ \mathbf{Zb} ~+~\boldsymbol{\epsilon}
\end{displaymath}
\begin{itemize}
\item $\mathbf{y} \sim N_n(\mathbf{X}\boldsymbol{\beta}, \mathbf{Z} \boldsymbol{\Sigma}_b \mathbf{Z}^\prime + \sigma^2 \mathbf{I}_n)$
\item Estimate $\boldsymbol{\beta}$ as usual with $(\mathbf{X}^\prime \mathbf{X})^{-1} \mathbf{X}^\prime \mathbf{y}$.
\item Estimate $\boldsymbol{\Sigma}_b$ and $\sigma^2$ by maximum likelihood\pause, or by ``restricted'' maximum likelihood.
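\end{itemize}
\end{frame}

\begin{frame}
\frametitle{Covariance matrix for the farm example}
\framesubtitle{A sketch, assuming $\boldsymbol{\Sigma}_b = \sigma^2_\tau \mathbf{I}_5$ as in the one random factor model}
Write $\mathbf{J}_{10}$ for the $10 \times 10$ matrix of ones (notation introduced here only for illustration). With $\mathbf{Z}$ the $50 \times 5$ matrix of farm indicators,
\begin{displaymath}
\mathbf{Z} \boldsymbol{\Sigma}_b \mathbf{Z}^\prime + \sigma^2 \mathbf{I}_{50}
~=~
\left( \begin{array}{c c c}
\mathbf{B} & \cdots & \mathbf{0} \\
\vdots & \ddots & \vdots \\
\mathbf{0} & \cdots & \mathbf{B}
\end{array} \right),
\mbox{ where }
\mathbf{B} = \sigma^2_\tau \mathbf{J}_{10} + \sigma^2 \mathbf{I}_{10}.
\end{displaymath}
\begin{itemize}
\item There are 5 diagonal blocks, one for each farm.
\item Each diagonal element of $\mathbf{B}$ is $\sigma^2_\tau + \sigma^2$, the variance of a single $Y_{ij}$.
\item Each off-diagonal element of $\mathbf{B}$ is $\sigma^2_\tau$, the covariance between two cows on the same farm.
\item The zero blocks say that cows on different farms are uncorrelated.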
\end{itemize}
\end{frame}

\begin{frame}
\frametitle{Restricted maximum likelihood}
\framesubtitle{For the record}
\begin{displaymath}
\mathbf{y}~=~\mathbf{X} \boldsymbol{\beta} ~+~ \mathbf{Zb} ~+~\boldsymbol{\epsilon}
\end{displaymath}
%\vspace{5mm}
\begin{itemize}
\item Transform $\mathbf{y}$ by the $q \times n$ matrix $\mathbf{K}$.
\item The rows of $\mathbf{K}$ are orthogonal to the columns of $\mathbf{X}$, meaning $\mathbf{KX} = \mathbf{0}$.
\item Then
\begin{eqnarray*}
\mathbf{Ky} & = & \mathbf{KX} \boldsymbol{\beta} + \mathbf{KZb} + \mathbf{K}\boldsymbol{\epsilon} \\
 & = & \mathbf{KZb} + \mathbf{K}\boldsymbol{\epsilon} \\
 & \sim & N(\mathbf{0}, \mathbf{KZ}\boldsymbol{\Sigma}_b\mathbf{Z^\prime K^\prime} + \sigma^2 \mathbf{KK}^\prime)
\end{eqnarray*}
\item Estimate $\boldsymbol{\Sigma}_b$ and $\sigma^2$ by maximum likelihood.
\item A big theorem says the resulting ``restricted'' MLE does not depend on the choice of $\mathbf{K}$.
\end{itemize}
\end{frame}

\begin{frame}
\frametitle{Nice results from restricted maximum likelihood}
%\framesubtitle{}
\begin{itemize}
\item $F$ statistics that correspond to the classical ones for balanced designs.
\item For unbalanced designs, ``$F$ statistics'' that are actually excellent $F$ approximations: not quite $F$, but very close.
\item R's \texttt{lme4} and \texttt{nlme} packages and SAS \texttt{proc mixed}.
% \item Like $cov(\boldsymbol{\epsilon})$ can be block diagonal, with useful structures \ldots
\end{itemize}
\end{frame}

\section{Random Intercept Models}

\begin{frame}
\frametitle{Random Intercept Models for Within-cases}
%\framesubtitle{}
\begin{itemize}
\item Drop the complicated classical mixed model machinery.
\item Retain the basic good idea.
\item Each subject (person, case) contributes an individual shock that pushes all the data values from that person up or down by the same amount. \pause
\item Because cases are randomly sampled (pretend), it's a random shock. \pause
\item This is still a mixed model, but it's much simpler.
\end{itemize}
\end{frame}

\begin{frame}
\frametitle{Example: The Noise study}
%\framesubtitle{}
Females and males carry out a discrimination task under 3 levels of background noise.
\end{frame}

\begin{frame}
\frametitle{Copyright Information}

This slide show was prepared by \href{http://www.utstat.toronto.edu/brunner}{Jerry Brunner}, Department of Statistics, University of Toronto. It is licensed under a \href{http://creativecommons.org/licenses/by-sa/3.0/deed.en_US} {Creative Commons Attribution - ShareAlike 3.0 Unported License}. Use any part of it as you like and share the result freely. 