\documentclass[12pt]{article}
%\usepackage{amsbsy} % for \boldsymbol and \pmb
\usepackage{graphicx} % To include pdf files!
\usepackage{amsmath}
\usepackage{amsbsy}
\usepackage{amsfonts} % for \mathbb{R} The set of reals
\usepackage[colorlinks=true, pdfstartview=FitV, linkcolor=blue, citecolor=blue, urlcolor=blue]{hyperref} % For links
\usepackage{fullpage}
%\pagestyle{empty} % No page numbers

\begin{document}
%\enlargethispage*{1000 pt}
\begin{center}
{\Large \textbf{STA 302f15 Assignment Five}}\footnote{Copyright information is at the end of the last page.}
\vspace{1 mm}
\end{center}

\noindent These problems are preparation for the quiz in tutorial on Friday October 15th, and are not to be handed in.

\vspace{2mm}

\noindent The general linear regression model is $\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$, where $\mathbf{X}$ is an $n \times (k+1)$ matrix of observable constants, $\boldsymbol{\beta}$ is a $(k+1) \times 1$ vector of unknown constants (parameters), and $\boldsymbol{\epsilon}$ is an $n \times 1$ vector of unobservable random variables with $E(\boldsymbol{\epsilon})=\mathbf{0}$ and $cov(\boldsymbol{\epsilon})=\sigma^2\mathbf{I}_n$, where $\sigma^2>0$ is an unknown constant parameter.

\begin{enumerate}
\item For the general linear regression model, what are $E(\mathbf{Y})$ and $cov(\mathbf{Y})$?

\item \label{glm} For the general linear regression model,
\begin{enumerate}
\item Show (there is no difference between ``show'' and ``prove'') that the matrix $\mathbf{X^\prime X}$ is symmetric.
\item Recall that the $p \times p$ matrix $\mathbf{A}$ is said to be \emph{non-negative definite} if $\mathbf{v}^\prime \mathbf{Av} \geq 0$ for all constant vectors $\mathbf{v} \in \mathbb{R}^p$. Show that $\mathbf{X}^\prime\mathbf{X}$ is non-negative definite.
\item Show that if the columns of $\mathbf{X}$ are linearly independent, then $\mathbf{X^\prime X}$ is positive definite.
\item Show that if $\mathbf{X^\prime X}$ is positive definite, then $(\mathbf{X^\prime X})^{-1}$ exists.
\item Show that if $(\mathbf{X^\prime X})^{-1}$ exists, then the columns of $\mathbf{X}$ are linearly independent.
\end{enumerate}
This is a good problem because it establishes that the least squares estimator $\widehat{\boldsymbol{\beta}} = (\mathbf{X}^\prime\mathbf{X})^{-1}\mathbf{X}^\prime\mathbf{Y}$ exists if and only if the columns of $\mathbf{X}$ are linearly independent, meaning that no independent variable is a linear combination of the others.

\item Is $\widehat{\boldsymbol{\beta}}$ an unbiased estimator of $\boldsymbol{\beta}$? Answer Yes or No and show your work.

\item Calculate $cov(\widehat{\boldsymbol{\beta}})$ and simplify. Show your work.

\item Define $\widehat{\mathbf{Y}} = \mathbf{X}\widehat{\boldsymbol{\beta}} = \mathbf{HY}$, where $\mathbf{H} = \mathbf{X}(\mathbf{X}^\prime\mathbf{X})^{-1}\mathbf{X}^\prime$. Define $\widehat{\boldsymbol{\epsilon}} = \mathbf{Y}-\widehat{\mathbf{Y}}$.
\begin{enumerate}
\item What are the dimensions of the matrix $\mathbf{H}$?
\item Show that $\mathbf{H}$ is symmetric.
\item Show that $\mathbf{H}$ is idempotent, meaning $\mathbf{H} = \mathbf{H}^2$.
\item Show that $\widehat{\boldsymbol{\epsilon}} = (\mathbf{I}-\mathbf{H})\mathbf{Y}$.
\item Show that $\mathbf{I}-\mathbf{H}$ is symmetric.
\item Show that $\mathbf{I}-\mathbf{H}$ is idempotent.
\item What are the dimensions of the matrix $\widehat{\boldsymbol{\beta}}$?
\item What are the dimensions of the matrix $\widehat{\mathbf{Y}}$?
\item What is $E(\widehat{\mathbf{Y}})$? Show your work.
\item What is $cov(\widehat{\mathbf{Y}})$? Show your work. It is easier if you use $\mathbf{H}$.
\item What are the dimensions of the matrix $\widehat{\boldsymbol{\epsilon}}$?
\item What is $E(\widehat{\boldsymbol{\epsilon}})$? Show your work. Is $\widehat{\boldsymbol{\epsilon}}$ an unbiased estimator of $\boldsymbol{\epsilon}$? This is a trick question and requires thought.
\item What is $cov(\widehat{\boldsymbol{\epsilon}})$? Show your work. It is easier if you use $\mathbf{I}-\mathbf{H}$.
% \item What are the dimensions of the matrix ?
% \item What is $E()$? Show your work.
% \item What is $cov()$? Show your work.
\end{enumerate}

\item \label{perpendicular} Show that $\mathbf{X}^\prime \widehat{\boldsymbol{\epsilon}} = \mathbf{0}$. If the statement is false (not true in general), explain why it is false.

%%%%%%%%%%%%%%%%%%%%

\item \label{scalar} The scalar form of the general linear regression model is
\begin{displaymath}
Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + \epsilon_i,
\end{displaymath}
where $\epsilon_1, \ldots, \epsilon_n$ are a random sample from a distribution with expected value zero and variance $\sigma^2$. The numbers $x_{ij}$ are known, observed constants, while $\beta_0, \ldots, \beta_k$ and $\sigma^2$ are unknown constants (parameters). The term ``random sample'' means independent and identically distributed in this course, so the $\epsilon_i$ random variables have zero covariance with one another.
\begin{enumerate}
\item What is $E(Y_i)$?
\item What is $Var(Y_i)$?
\item What is $Cov(Y_i,Y_j)$ for $i \neq j$?
\end{enumerate}

\item Starting with the scalar form of the linear regression model (see Question~\ref{scalar}), we obtain least squares estimates of the $\beta$ values by minimizing the sum of squared differences between the observed $Y_i$ and $E(Y_i)$. That is, we choose $\beta_0, \ldots, \beta_k$ to make
\begin{displaymath}
Q(\boldsymbol{\beta})=\sum_{i=1}^n(Y_i-\beta_0 - \beta_1 x_{i1} - \cdots - \beta_k x_{ik})^2
\end{displaymath}
as small as possible.
\begin{enumerate}
\item Differentiate $Q(\boldsymbol{\beta})$ with respect to $\beta_0$ and set the derivative to zero, obtaining the first \emph{normal equation}.
\item Noting that the quantities $\widehat{\beta}_0, \ldots, \widehat{\beta}_k$ (whatever they are) must satisfy the first normal equation, show that the least squares plane must pass through the point $(\overline{x}_1, \overline{x}_2, \ldots, \overline{x}_k, \overline{Y})$.
\item Defining the ``predicted'' $Y_i$ as $\widehat{Y}_i = \widehat{\beta}_0 + \widehat{\beta}_1 x_{i1} + \cdots + \widehat{\beta}_k x_{ik}$, show that $\sum_{i=1}^n \widehat{Y}_i = \sum_{i=1}^n Y_i$.
\item The \emph{residual} for observation $i$ is defined by $\widehat{\epsilon}_i = Y_i - \widehat{Y}_i$. Show that the sum of the residuals is exactly zero.
\end{enumerate}

\item \label{simple} ``Simple'' regression is just regression with a single independent variable. The model equation is $Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$. Fitting this simple regression problem into the matrix framework of the general linear regression model,
\begin{enumerate}
\item What is the $\mathbf{X}$ matrix?
\item What is $\mathbf{X^\prime X}$?
\item What is $\mathbf{X^\prime Y}$?
\item What is $(\mathbf{X^\prime X})^{-1}$?
\end{enumerate}

\item In Question \ref{simple}, the model had both an intercept and one independent variable. But suppose the model has no intercept. This is called simple \emph{regression through the origin}. The model would be $Y_i = \beta_1 x_i + \epsilon_i$.
\begin{enumerate}
\item What is the $\mathbf{X}$ matrix?
\item What is $\mathbf{X^\prime X}$?
\item What is $\mathbf{X^\prime Y}$?
\item What is $(\mathbf{X^\prime X})^{-1}$?
\end{enumerate}

\item There can even be a regression model with an intercept but no independent variable. In this case the model would be $Y_i = \beta_0 + \epsilon_i$.
\begin{enumerate}
\item Find the least squares estimator $\widehat{\beta}_0$ with calculus.
\item What is the $\mathbf{X}$ matrix?
\item What is $\mathbf{X^\prime X}$?
\item What is $\mathbf{X^\prime Y}$?
\item What is $(\mathbf{X^\prime X})^{-1}$?
\item Verify that your expression for $\widehat{\beta}_0$ agrees with $\widehat{\boldsymbol{\beta}} = (\mathbf{X}^\prime\mathbf{X})^{-1}\mathbf{X}^\prime\mathbf{Y}$.
\end{enumerate}

\item Referring to the matrix version of the linear model and letting $\widehat{\boldsymbol{\beta}} = (\mathbf{X}^\prime\mathbf{X})^{-1}\mathbf{X}^\prime\mathbf{Y}$ (which implies that the columns of $\mathbf{X}$ must be linearly independent), show that \linebreak $(\mathbf{Y}-\widehat{\mathbf{Y}})^\prime (\widehat{\mathbf{Y}} - \mathbf{X}\boldsymbol{\beta}) = 0$.

\item Using the result of the preceding question and writing $Q(\boldsymbol{\beta})$ as $Q(\boldsymbol{\beta}) = (\mathbf{Y}-\mathbf{X}\boldsymbol{\beta})^\prime (\mathbf{Y}-\mathbf{X}\boldsymbol{\beta})$, show that $Q(\boldsymbol{\beta}) = (\mathbf{Y}-\mathbf{X}\widehat{\boldsymbol{\beta}})^\prime (\mathbf{Y}-\mathbf{X}\widehat{\boldsymbol{\beta}}) + (\widehat{\boldsymbol{\beta}}-\boldsymbol{\beta})^\prime (\mathbf{X^\prime X}) (\widehat{\boldsymbol{\beta}}-\boldsymbol{\beta})$. Why does this imply that the minimum of $Q(\boldsymbol{\beta})$ occurs at $\boldsymbol{\beta} = \widehat{\boldsymbol{\beta}}$? How do you know that the minimum is unique?

\item The set of vectors $\mathcal{V} = \{\mathbf{v} = \mathbf{Xb}: \mathbf{b} \in \mathbb{R}^{k+1}\}$ is the subset of $\mathbb{R}^{n}$ consisting of all linear combinations of the columns of $\mathbf{X}$. That is, $\mathcal{V}$ is the space \emph{spanned} by the columns of $\mathbf{X}$. The least squares estimator $\widehat{\boldsymbol{\beta}} = (\mathbf{X}^\prime\mathbf{X})^{-1}\mathbf{X}^\prime\mathbf{Y}$ was obtained by minimizing $(\mathbf{Y}-\mathbf{Xb})^\prime(\mathbf{Y}-\mathbf{Xb})$ over all $\mathbf{b} \in \mathbb{R}^{k+1}$. Thus, $\widehat{\mathbf{Y}} = \mathbf{X}\widehat{\boldsymbol{\beta}}$ is the point in $\mathcal{V}$ that is \emph{closest} to the data vector $\mathbf{Y}$. Geometrically, $\widehat{\mathbf{Y}}$ is the \emph{projection} (shadow) of $\mathbf{Y}$ onto $\mathcal{V}$. The hat matrix $\mathbf{H}$ is a \emph{projection matrix}: it projects any point in $\mathbb{R}^{n}$ onto $\mathcal{V}$. Now we will test out several consequences of this idea; a small numerical illustration follows the parts below.
\begin{enumerate}
\item The shadow of a point already in $\mathcal{V}$ should be right at the point itself. Show that if $\mathbf{v} \in \mathcal{V}$, then $\mathbf{Hv}= \mathbf{v}$.
\item The vector of differences $\widehat{\boldsymbol{\epsilon}} = \mathbf{Y} - \widehat{\mathbf{Y}}$ should be perpendicular (at right angles) to each and every basis vector of $\mathcal{V}$. How is this related to Question~\ref{perpendicular}?
\item Show that the vector of residuals $\widehat{\boldsymbol{\epsilon}}$ is perpendicular to any $\mathbf{v} \in \mathcal{V}$.
\end{enumerate}

% Gauss-Markov next time, including regression through the origin and simple independent random sample.
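\vspace{2mm}

\noindent As a small illustration of the projection idea (this is not one of the questions, and the numbers are made up), suppose $n = 2$ and $\mathcal{V}$ is spanned by the single column $\mathbf{x} = (1, 2)^\prime$. Then
\begin{displaymath}
\mathbf{H} = \mathbf{x}(\mathbf{x}^\prime\mathbf{x})^{-1}\mathbf{x}^\prime
 = \frac{1}{5} \left( \begin{array}{rr} 1 & 2 \\ 2 & 4 \end{array} \right),
\end{displaymath}
and one can check directly that $\mathbf{H}^\prime = \mathbf{H}$, $\mathbf{H}^2 = \mathbf{H}$ and $\mathbf{H}\mathbf{x} = \mathbf{x}$, while for any $\mathbf{y} \in \mathbb{R}^2$ the residual $\mathbf{y} - \mathbf{H}\mathbf{y}$ is perpendicular to $\mathbf{x}$.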
\end{enumerate}

%\vspace{110mm}

\noindent
\begin{center}\begin{tabular}{l}
\hspace{6in} \\
\hline
\end{tabular}\end{center}
This assignment was prepared by \href{http://www.utstat.toronto.edu/~brunner}{Jerry Brunner}, Department of Statistical Sciences, University of Toronto. It is licensed under a \href{http://creativecommons.org/licenses/by-sa/3.0/deed.en_US}{Creative Commons Attribution - ShareAlike 3.0 Unported License}. Use any part of it as you like and share the result freely. The \LaTeX~source code is available from the course website: \href{http://www.utstat.toronto.edu/~brunner/oldclass/302f15}{\small\texttt{http://www.utstat.toronto.edu/$^\sim$brunner/oldclass/302f15}}

\end{document}