% 302f16Assignment4.tex
\documentclass[12pt]{article}
%\usepackage{amsbsy} % for \boldsymbol and \pmb
\usepackage{graphicx} % To include pdf files!
\usepackage{amsmath}
\usepackage{amsbsy}
\usepackage{amsfonts}
\usepackage[colorlinks=true, pdfstartview=FitV, linkcolor=blue, citecolor=blue, urlcolor=blue]{hyperref} % For links
\usepackage{fullpage}
%\pagestyle{empty} % No page numbers

\begin{document}
%\enlargethispage*{1000 pt}
\begin{center}
{\Large \textbf{STA 302f16 Assignment Four}}\footnote{Copyright information is at the end of the last page.}
\vspace{1 mm}
\end{center}

% In an effort to use the text, I've commented out some material that may be useful for quiz and exam questions.

\noindent The general linear regression model is $\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\epsilon}$, where $X$ is an $n \times (k+1)$ matrix of observable constants, $\boldsymbol{\beta}$ is a $(k+1) \times 1$ vector of unknown constants (parameters), and $\boldsymbol{\epsilon}$ is an $n \times 1$ vector of unobservable random variables with $E(\boldsymbol{\epsilon})=\mathbf{0}$ and $cov(\boldsymbol{\epsilon})=\sigma^2I_n$. The error variance $\sigma^2>0$ is an unknown constant parameter.

\begin{enumerate}

\item \label{glm} For the general linear regression model,
\begin{enumerate}
\item Show (there is no difference between ``show'' and ``prove'') that the matrix $X^\prime X$ is symmetric.
\item Show that $X^\prime X$ is non-negative definite.
\item Show that if the columns of $X$ are linearly independent, then $X^\prime X$ is positive definite.
\item Show that if $X^\prime X$ is positive definite, then $(X^\prime X)^{-1}$ exists.
\item Show that if $(X^\prime X)^{-1}$ exists, then the columns of $X$ are linearly independent.
\end{enumerate}
This is a good problem because it establishes that the least squares estimator $\mathbf{b} = (X^\prime X)^{-1}X^\prime\mathbf{y}$ exists if and only if the columns of $X$ are linearly independent, meaning that no independent variable is a linear combination of the other ones.

\item Let $\widehat{\mathbf{y}} = X\mathbf{b} = H\mathbf{y}$, where $H = X(X^\prime X)^{-1}X^\prime$. The residuals are in the vector $\mathbf{e} = \mathbf{y}-\widehat{\mathbf{y}}$.
\begin{enumerate}
\item What are the dimensions of the matrix $H$? Give the number of rows and the number of columns.
\item Show that $H$ is symmetric.
\item Show that $H$ is idempotent, meaning $H = H^2$.
\item Using $tr(AB)=tr(BA)$, find $tr(H)$.
\item Show that $\mathbf{e} = (I-H)\mathbf{y}$.
\item Show that $M = I-H$ is symmetric.
\item Show that $M$ is idempotent.
\item Using $tr(AB)=tr(BA)$, find $tr(M)$.
% \item What are the dimensions of the matrix ?
% \item What is $E()$? Show your work.
% \item What is $cov()$? Show your work.
\end{enumerate}
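A numerical example settles nothing here, but it may help you see what these facts are claiming. Just for illustration, suppose $n=3$, $k=1$ and
\begin{displaymath}
X = \left( \begin{array}{rr} 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{array} \right), \mbox{ so that }
X^\prime X = \left( \begin{array}{rr} 3 & 3 \\ 3 & 5 \end{array} \right) \mbox{ and }
(X^\prime X)^{-1} = \frac{1}{6}\left( \begin{array}{rr} 5 & -3 \\ -3 & 3 \end{array} \right),
\end{displaymath}
which gives
\begin{displaymath}
H = X(X^\prime X)^{-1}X^\prime = \frac{1}{6}\left( \begin{array}{rrr} 5 & 2 & -1 \\ 2 & 2 & 2 \\ -1 & 2 & 5 \end{array} \right).
\end{displaymath}
For this particular $X$ you can verify by direct multiplication that $H$ is symmetric, that $H^2 = H$, and that $tr(H) = 2 = k+1$. Your answers should show why these facts hold for \emph{any} $X$ whose columns are linearly independent.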
\item Please read Chapter 2, pages 28--37 in the textbook. % Just before decomposition of SS.
\begin{enumerate}
\item This question starts with something you have already done. For the case of simple regression with $k=1$ independent variable, partially differentiate $\mathcal{S}$ --- defined in the first line of (2.6) --- with respect to $\beta_0$ and $\beta_1$. Set both derivatives to zero, obtaining two equations in two unknowns. Now here's the new part. Write these equations in matrix form, obtaining a special case of (2.8).
\begin{enumerate}
\item What is the $X^\prime X$ matrix? It is a $2 \times 2$ matrix with a formula in each cell.
\item What is the $X^\prime \mathbf{y}$ matrix? It is a $2 \times 1$ matrix with a formula in each cell.
\end{enumerate}
\item Show that $M\boldsymbol{\epsilon}=\mathbf{e}$.
\item \label{perpendicular} Prove that $X^\prime \mathbf{e} = \mathbf{0}$. If the statement is false (not true in general), explain why it is false.
\item Prove Theorem 2.1 in the text. I know this is a bit redundant.
\item Why does $X^\prime\mathbf{e}=\mathbf{0}$ tell you that if a regression model has an intercept, the residuals must add up to zero?
\item Letting $\mathcal{S} = (\mathbf{y}-X\boldsymbol{\beta})^\prime (\mathbf{y}-X\boldsymbol{\beta})$, show that
\begin{displaymath}
\mathcal{S} = (\mathbf{y}-X\mathbf{b})^\prime (\mathbf{y}-X\mathbf{b}) + (\mathbf{b}-\boldsymbol{\beta})^\prime (X^\prime X) (\mathbf{b}-\boldsymbol{\beta}).
\end{displaymath}
Why does this imply that the minimum of $\mathcal{S}(\boldsymbol{\beta})$ occurs at $\boldsymbol{\beta} = \mathbf{b}$? The columns of $X$ are linearly independent. Why does linear independence guarantee that the minimum is unique?
\item What are the dimensions of the random vector $\mathbf{b}$ as defined in Expression (2.9)?
\item Is $\mathbf{b}$ an unbiased estimator of $\boldsymbol{\beta}$? Answer Yes or No and show your work.
\item Calculate $cov(\mathbf{b})$ and simplify. Show your work.
\item What are the dimensions of the random vector $\widehat{\mathbf{y}}$?
\item What is $E(\widehat{\mathbf{y}})$? Show your work.
\item What is $cov(\widehat{\mathbf{y}})$? Show your work. It is easier if you use $H$.
\item What are the dimensions of the random vector $\mathbf{e}$?
\item What is $E(\mathbf{e})$? Show your work. Is $\mathbf{e}$ an unbiased estimator of $\boldsymbol{\epsilon}$? This is a trick question, and requires thought.
\item What is $cov(\mathbf{e})$? Show your work. It is easier if you use $I-H$.
\item Prove $E(\mathbf{e}^\prime\mathbf{e}) = \sigma^2(n-k-1)$.
\item Do Exercises 2.1, 2.3 and 2.6 in the text.
\end{enumerate}
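Again, a numerical example is not a proof, but continuing the small illustration above (the same $3 \times 2$ matrix $X$), suppose the observed data vector were $\mathbf{y} = (1, \, 3, \, 3)^\prime$. Then
\begin{displaymath}
X^\prime\mathbf{y} = \left( \begin{array}{c} 7 \\ 9 \end{array} \right), \quad
\mathbf{b} = (X^\prime X)^{-1}X^\prime\mathbf{y} = \left( \begin{array}{c} 4/3 \\ 1 \end{array} \right), \quad
\widehat{\mathbf{y}} = X\mathbf{b} = \left( \begin{array}{c} 4/3 \\ 7/3 \\ 10/3 \end{array} \right) \mbox{ and }
\mathbf{e} = \mathbf{y}-\widehat{\mathbf{y}} = \left( \begin{array}{r} -1/3 \\ 2/3 \\ -1/3 \end{array} \right).
\end{displaymath}
You can check directly that the residuals add up to zero and that $X^\prime\mathbf{e}=\mathbf{0}$ for these numbers, as the results above say they must.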
%%%%%%%%%%%%%%%%%%%%
\item \label{scalar} The scalar form of the general linear regression model is
\begin{displaymath}
y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + \epsilon_i,
\end{displaymath}
where $\epsilon_1, \ldots, \epsilon_n$ are a random sample from a distribution with expected value zero and variance $\sigma^2$. The numbers $x_{ij}$ are known, observed constants, while $\beta_0, \ldots, \beta_k$ and $\sigma^2$ are unknown constants (parameters). The term ``random sample'' means independent and identically distributed in this course, so the $\epsilon_i$ random variables have zero covariance with one another.
\begin{enumerate}
\item What is $E(y_i)$?
\item What is $Var(y_i)$?
\item What is $Cov(y_i,y_j)$ for $i \neq j$?
\end{enumerate}

\item In \emph{simple regression through the origin}, there is one independent variable and no intercept. The model is $y_i = \beta_1 x_i + \epsilon_i$.
\begin{enumerate}
\item What is the $X$ matrix?
\item What is $X^\prime X$?
\item What is $X^\prime \mathbf{y}$?
\item What is $(X^\prime X)^{-1}$?
\item What is $b_1 = (X^\prime X)^{-1}X^\prime\mathbf{y}$? Compare your answer to (1.22) on page 11 in the textbook.
\end{enumerate}

\item There can even be a regression model with an intercept and no independent variables. In this case the model would be $y_i = \beta_0 + \epsilon_i$.
\begin{enumerate}
\item \label{ybar} Find the least squares estimator of $\beta_0$ with calculus.
\item What is the $X$ matrix?
\item What is $X^\prime X$?
\item What is $X^\prime \mathbf{y}$?
\item What is $(X^\prime X)^{-1}$?
\item What is $b_0 = (X^\prime X)^{-1}X^\prime\mathbf{y}$? Compare this with your answer to Question~\ref{ybar}.
\end{enumerate}

\item The set of vectors $\mathcal{V} = \{\mathbf{v} = X\mathbf{a}: \mathbf{a} \in \mathbb{R}^{k+1}\}$ is the subset of $\mathbb{R}^{n}$ consisting of all linear combinations of the columns of $X$. That is, $\mathcal{V}$ is the space \emph{spanned} by the columns of $X$. The least squares estimator $\mathbf{b} = (X^\prime X)^{-1}X^\prime\mathbf{y}$ was obtained by minimizing $(\mathbf{y}-X\mathbf{a})^\prime(\mathbf{y}-X\mathbf{a})$ over all $\mathbf{a} \in \mathbb{R}^{k+1}$. Thus, $\widehat{\mathbf{y}} = X\mathbf{b}$ is the point in $\mathcal{V}$ that is \emph{closest} to the data vector $\mathbf{y}$. Geometrically, $\widehat{\mathbf{y}}$ is the \emph{projection} (shadow) of $\mathbf{y}$ onto $\mathcal{V}$. The hat matrix $H$ is a \emph{projection matrix}: it projects any point in $\mathbb{R}^{n}$ onto $\mathcal{V}$. Now we will test out several consequences of this idea.
\begin{enumerate}
\item The shadow of a point already in $\mathcal{V}$ should be right at the point itself. Show that if $\mathbf{v} \in \mathcal{V}$, then $H\mathbf{v}= \mathbf{v}$.
\item The vector of differences $\mathbf{e} = \mathbf{y} - \widehat{\mathbf{y}}$ should be perpendicular (at right angles) to each and every basis vector of $\mathcal{V}$. How is this related to Question~\ref{perpendicular}?
\item Show that the vector of residuals $\mathbf{e}$ is perpendicular to any $\mathbf{v} \in \mathcal{V}$.
\end{enumerate}

\end{enumerate}

%\vspace{60mm}
\noindent
\begin{center}\begin{tabular}{l}
\hspace{6in} \\
\hline
\end{tabular}\end{center}
This assignment was prepared by \href{http://www.utstat.toronto.edu/~brunner}{Jerry Brunner}, Department of Statistical Sciences, University of Toronto. It is licensed under a
\href{http://creativecommons.org/licenses/by-sa/3.0/deed.en_US}
{Creative Commons Attribution - ShareAlike 3.0 Unported License}. Use any part of it as you like and share the result freely. The \LaTeX~source code is available from the course website:
\href{http://www.utstat.toronto.edu/~brunner/oldclass/302f16}
{\small\texttt{http://www.utstat.toronto.edu/$^\sim$brunner/oldclass/302f16}}

\end{document}

% Extra question, not part of the compiled assignment (it appears after \end{document}).
\item \label{simple} ``Simple'' regression is just regression with a single independent variable. The model equation is $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$. Fitting this simple regression problem into the matrix framework of the general linear regression model,
\begin{enumerate}
\item What is the $X$ matrix?
\item What is $X^\prime X$?
\item What is $X^\prime \mathbf{y}$?
\end{enumerate}