Structural Equation Models

An extension of multiple regression.
Can incorporate measurement error.
More than one regression-like equation.
All the variables are random.
An explanatory variable in one equation can be the response variable in another equation.

Measurement Error

What you see is not what you really want.
Latent variable: A random variable whose values cannot be directly observed. For example, family income last year.
Contrast with an observable variable: for example, reported family income last year.
Usually, interest is in relationships between latent variables.
But all you have in your data set are the observable variables.
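A minimal simulation sketch of the latent/observable distinction, assuming the reported income equals the true income plus independent normal measurement error. The function name `simulate_income`, the means, and the standard deviations are hypothetical, chosen only for illustration.

```python
import random
import statistics


def simulate_income(n=20000, seed=431):
    """Simulate latent income and an error-prone report of it (hypothetical numbers)."""
    rng = random.Random(seed)
    latent, reported = [], []
    for _ in range(n):
        x = rng.gauss(60.0, 15.0)  # latent: true family income in $1000s (assumed values)
        e = rng.gauss(0.0, 10.0)   # measurement error, independent of x (assumption)
        latent.append(x)
        reported.append(x + e)     # observable: what actually lands in the data set
    return latent, reported


latent, reported = simulate_income()
print("Sample variance of latent income:  ", round(statistics.variance(latent), 1))
print("Sample variance of reported income:", round(statistics.variance(reported), 1))
```

Under these assumptions the reported variable has a larger variance than the latent one, because the error variance is added to it. In a real data set only `reported` would be available.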

Doubly Labeled Water
Participants drink water enriched with two isotopes; urine samples then allow energy expenditure to be measured. (Graphics used without permission.)

Path diagrams
Example: Exercise and arthritis pain

Comments

Latent variables are in ovals, observable variables are in boxes.
Error terms appear as arrows that seem to come from nowhere; they are often not shown.
There is real modeling here. Lots of theoretical input is required.
These are usually interpreted as causal models: Models of influence.
$A \rightarrow B$ means $A$ has an influence on $B$.
But the data are usually observational.

Path diagrams correspond to systems of equations

{\scriptsize
\begin{eqnarray*}
Y_{i,1} & = & \beta_{0,1} + \beta_1 X_i + \epsilon_{i,1} \\
Y_{i,2} & = & \beta_{0,2} + \beta_2 Y_{i,1} + \epsilon_{i,2} \\
Y_{i,3} & = & \beta_{0,3} + \beta_3 X_i + \beta_4 Y_{i,2} + \epsilon_{i,3} \\
Y_{i,4} & = & \beta_{0,4} + \beta_5 Y_{i,2} + \beta_6 Y_{i,3} + \epsilon_{i,4} \\
D_{i,1} & = & \lambda_{0,1} + \lambda_1 Y_{i,1} + e_{i,1} \\
D_{i,2} & = & \lambda_{0,2} + \lambda_2 X_i + e_{i,2} \\
D_{i,3} & = & \lambda_{0,3} + \lambda_3 Y_{i,2} + e_{i,3} \\
D_{i,4} & = & \lambda_{0,4} + \lambda_4 Y_{i,3} + e_{i,4} \\
D_{i,5} & = & \lambda_{0,5} + \lambda_2 X_i + e_{i,5} \\
D_{i,6} & = & \lambda_{0,6} + \lambda_5 Y_{i,4} + e_{i,6}
\end{eqnarray*}
}

Multivariate normal model is standard.
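One way to read the system of equations is as a recipe for generating data. The sketch below simulates from it, assuming independent normal exogenous variables and error terms (a special case of the multivariate normal model). All parameter values, standard deviations, and the names `simulate_case` and `simulate_data` are hypothetical. As in the equations, $D_{i,2}$ and $D_{i,5}$ share the coefficient $\lambda_2$.

```python
import random

# Hypothetical parameter values, chosen only for illustration.
BETA0 = {1: 1.0, 2: 0.5, 3: -1.0, 4: 2.0}                  # beta_{0,j}
BETA = {1: 0.8, 2: 0.6, 3: 0.4, 4: 0.5, 5: 0.3, 6: 0.7}    # beta_1 .. beta_6
LAMBDA0 = {k: 0.0 for k in range(1, 7)}                     # lambda_{0,k}
LAMBDA = {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0, 5: 1.0}          # lambda_1 .. lambda_5


def simulate_case(rng):
    """Generate the six observable variables D_1..D_6 for one case."""
    x = rng.gauss(0.0, 1.0)                                # latent exogenous X_i
    eps = {j: rng.gauss(0.0, 1.0) for j in range(1, 5)}    # epsilon_{i,j}
    e = {k: rng.gauss(0.0, 0.5) for k in range(1, 7)}      # measurement errors e_{i,k}

    # Latent variable equations: each Y may depend on X and on earlier Y's.
    y1 = BETA0[1] + BETA[1] * x + eps[1]
    y2 = BETA0[2] + BETA[2] * y1 + eps[2]
    y3 = BETA0[3] + BETA[3] * x + BETA[4] * y2 + eps[3]
    y4 = BETA0[4] + BETA[5] * y2 + BETA[6] * y3 + eps[4]

    # Measurement equations: only these values would appear in the data set.
    return [
        LAMBDA0[1] + LAMBDA[1] * y1 + e[1],
        LAMBDA0[2] + LAMBDA[2] * x + e[2],
        LAMBDA0[3] + LAMBDA[3] * y2 + e[3],
        LAMBDA0[4] + LAMBDA[4] * y3 + e[4],
        LAMBDA0[5] + LAMBDA[2] * x + e[5],   # same lambda_2 as D_2, as in the equations
        LAMBDA0[6] + LAMBDA[5] * y4 + e[6],
    ]


def simulate_data(n, seed=2101):
    """Return n independent cases, each a list [D_1, ..., D_6]."""
    rng = random.Random(seed)
    return [simulate_case(rng) for _ in range(n)]


data = simulate_data(5)
for row in data:
    print([round(d, 2) for d in row])
```

The latent values `x`, `y1`, ..., `y4` are deliberately discarded: an analysis would have to recover information about the $\beta_j$ from the joint distribution of the observable $D$ variables alone.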

Regression with observable variables

\begin{displaymath} Y_i = \beta_0 + \beta_1 X_{i,1} + \beta_2 X_{i,2} + \beta_3 X_{i,3} + \epsilon_i \end{displaymath}

Tools

Scalar variance-covariance calculations
Matrices
Random vectors
Multivariate normal
Maximum likelihood
A little large-sample theory
SAS
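
As a small illustration of the scalar variance-covariance calculations listed above, take the first equation of the system, $Y_{i,1} = \beta_{0,1} + \beta_1 X_i + \epsilon_{i,1}$, and assume that $X_i$ is independent of $\epsilon_{i,1}$ (an assumption made here for the sketch, not stated on the slides). Then

\begin{eqnarray*}
Cov(X_i, Y_{i,1}) & = & Cov(X_i, \beta_{0,1} + \beta_1 X_i + \epsilon_{i,1}) \\
& = & \beta_1 Cov(X_i, X_i) + Cov(X_i, \epsilon_{i,1}) \\
& = & \beta_1 Var(X_i) + 0 \\
& = & \beta_1 Var(X_i).
\end{eqnarray*}

The constant $\beta_{0,1}$ drops out because the covariance of a random variable with a constant is zero.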

Copyright Information
This slide show was prepared by Jerry Brunner, Department of Statistical Sciences, University of Toronto. Except for the picture taken from Carroll et al.'s Measurement error in non-linear models, it is licensed under a Creative Commons Attribution - ShareAlike 3.0 Unported License. Use any part of it as you like and share the result freely.