\documentclass[12pt]{article}
%\usepackage{amsbsy} % for \boldsymbol and \pmb
\usepackage{graphicx} % To include pdf files!
\usepackage{amsmath}
\usepackage{amsbsy}
\usepackage{amsfonts} % for \mathbb{R} The set of reals
\usepackage[colorlinks=true, pdfstartview=FitV, linkcolor=blue, citecolor=blue, urlcolor=blue]{hyperref} % For links
\usepackage{fullpage}
%\pagestyle{empty} % No page numbers
\begin{document}
%\enlargethispage*{1000 pt}
\begin{center}
{\Large \textbf{STA 302f15 Assignment Eight}}\footnote{Copyright information is at the end of the last page.}
\vspace{1 mm}
\end{center}

\noindent
In the general linear model, assume that $\boldsymbol{\epsilon} \sim N(\mathbf{0},\sigma^2\mathbf{I}_n)$. Also assume that the columns of the $\mathbf{X}$ matrix are linearly independent, so that the formulas for $\widehat{\boldsymbol{\beta}}$ and related quantities apply. You may use anything from the formula sheet unless you are explicitly asked to prove it, or are instructed otherwise. Use moment-generating functions \emph{only} if the question directly asks you to do so.

\begin{enumerate}

\item Label each of the following statements True (meaning always true) or False (meaning not always true), and show your work or explain.
    \begin{enumerate}
    \item $\widehat{\mathbf{y}} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon}$
    \item $\mathbf{y} = \mathbf{X} \widehat{\boldsymbol{\beta}} + \widehat{\boldsymbol{\epsilon}}$.
    \item $\widehat{\mathbf{y}} = \mathbf{X} \widehat{\boldsymbol{\beta}} + \widehat{\boldsymbol{\epsilon}}$
    \item $\mathbf{y} = \mathbf{X} \boldsymbol{\beta}$
    \item $\mathbf{X}^\prime\boldsymbol{\epsilon} = \mathbf{0}$
    \item $(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})^\prime (\mathbf{y}-\mathbf{X}\boldsymbol{\beta}) = \boldsymbol{\epsilon}^\prime\boldsymbol{\epsilon}$.
    \item $\widehat{\boldsymbol{\epsilon}}^\prime \, \widehat{\boldsymbol{\epsilon}} = \mathbf{0}$
    \item $\widehat{\boldsymbol{\epsilon}}^\prime \, \widehat{\boldsymbol{\epsilon}} = \mathbf{y}^\prime \, \widehat{\boldsymbol{\epsilon}}$.
    \item $W = \frac{\boldsymbol{\epsilon}^\prime\boldsymbol{\epsilon}}{\sigma^2}$ has a chi-squared distribution.
    \item $E(\boldsymbol{\epsilon}^\prime\boldsymbol{\epsilon})=0$
    \item $E(\widehat{\boldsymbol{\epsilon}}^\prime \, \widehat{\boldsymbol{\epsilon}})=0$
    \end{enumerate}

\item What is the distribution of $\mathbf{s}_1 = \mathbf{X}^\prime\boldsymbol{\epsilon}$? Show the calculation of expected value and variance-covariance matrix.

\item What is the distribution of $\mathbf{s}_2 = \mathbf{X}^\prime \, \widehat{\boldsymbol{\epsilon}}$?
    \begin{enumerate}
    \item Answer the question.
    \item Show the calculation of expected value and variance-covariance matrix.
    \item Is this a surprise? Answer Yes or No.
    \item What is the probability that $\mathbf{s}_2=\mathbf{0}$? The answer is a single number.
    \end{enumerate}

\pagebreak

\item The following are some distribution facts you are expected to know. Just give the answers. Only re-derive them if you can't remember.
    \begin{enumerate}
    \item Let $X\sim N(\mu,\sigma^2)$ and $Y=aX+b$, where $a$ and $b$ are constants. What is the distribution of $Y$?
    \item Let $X\sim N(\mu,\sigma^2)$ and $Z = \frac{X-\mu}{\sigma}$. What is the distribution of $Z$?
    \item Let $Z \sim N(0,1)$. What is the distribution of $Y=Z^2$?
    \item Let $X_1, \ldots, X_n$ be independent $N(\mu,\sigma^2)$ random variables. What is the distribution of the sample mean $\overline{X}$?
    \item Let $X_1, \ldots, X_n$ be independent $N(\mu,\sigma^2)$ random variables. What is the distribution of $Z = \frac{\sqrt{n}(\overline{X}-\mu)}{\sigma}$?
    \item Let $W_1, \ldots, W_n$ be independent $\chi^2(1)$ random variables. What is the distribution of $Y = \sum_{i=1}^n W_i$?
    \item Let $X_1, \ldots, X_n$ be independent $N(\mu,\sigma^2)$ random variables.
    What is the distribution of $Y = \frac{1}{\sigma^2} \sum_{i=1}^n\left(X_i-\mu \right)^2$?
    \item Let $Y=X_1+X_2$, where $X_1$ and $X_2$ are independent, $X_1\sim\chi^2(\nu_1)$ and $Y\sim\chi^2(\nu_1+\nu_2)$, where $\nu_1$ and $\nu_2$ are both positive. What is the distribution of $X_2$?
    \end{enumerate}

\item In an earlier assignment, you proved that
\begin{displaymath}
(\mathbf{Y}-\mathbf{X}\boldsymbol{\beta})^\prime (\mathbf{Y}-\mathbf{X}\boldsymbol{\beta}) = \widehat{\boldsymbol{\epsilon}}^\prime \, \widehat{\boldsymbol{\epsilon}} + (\widehat{\boldsymbol{\beta}}-\boldsymbol{\beta})^\prime (\mathbf{X^\prime X}) (\widehat{\boldsymbol{\beta}}-\boldsymbol{\beta}).
\end{displaymath}
Starting with this expression, show that $SSE/\sigma^2 \sim \chi^2(n-k-1)$. Use the formula sheet.

\item The $t$ distribution is defined as follows. Let $Z\sim N(0,1)$ and $W \sim \chi^2(\nu)$, with $Z$ and $W$ independent. Then $T = \frac{Z}{\sqrt{W/\nu}}$ is said to have a $t$ distribution with $\nu$ degrees of freedom, and we write $T \sim t(\nu)$.

For the general fixed effects linear regression model, tests and confidence intervals for linear combinations of regression coefficients are very useful. Derive the appropriate $t$ distribution and some applications by following these steps. Let $\mathbf{a}$ be a $(k+1) \times 1$ vector of constants.
    \begin{enumerate}
    \item What is the distribution of $\mathbf{a}^\prime \widehat{\boldsymbol{\beta}}$? Show a little work. Your answer should include both the expected value and the variance.
    \item Now standardize the difference (subtract off the mean and divide by the standard deviation) to obtain a standard normal.
    \item Divide by the square root of a well-chosen chi-squared random variable, divided by its degrees of freedom, and simplify. Call the result $T$.
    \item How do you know the numerator and denominator are independent?
    \item Suppose you wanted to test $H_0: \mathbf{a}^\prime\boldsymbol{\beta} = c$. Write down a formula for the test statistic.
    \item For a regression model with four independent variables, suppose you wanted to test $H_0: \beta_2=0$. Give the vector $\mathbf{a}$.
    \item For a regression model with four independent variables, suppose you wanted to test $H_0: \beta_1=\beta_2$. Give the vector $\mathbf{a}$.
    \item Letting $t_{\alpha/2}$ denote the point cutting off the top $\alpha/2$ of the $t$ distribution with $n-k-1$ degrees of freedom, derive the $(1-\alpha) \times 100\%$ confidence interval for $\mathbf{a}^\prime\boldsymbol{\beta}$. ``Derive'' means show the high school algebra.
    \end{enumerate}

\item For a multiple regression model with an intercept, let $SST = \sum_{i=1}^n(Y_i-\overline{Y})^2$, $SSE = \sum_{i=1}^n(Y_i-\widehat{Y}_i)^2$ and $SSR = \sum_{i=1}^n(\widehat{Y}_i-\overline{Y})^2$. Show that $SST=SSR+SSE$.

\item Still for a multiple regression model with an intercept, show that $\overline{Y}$ is a function of $\widehat{\boldsymbol{\beta}}$. Why does this establish that $SSR$ and $SSE$ are independent?

\item Continue assuming that the regression model has an intercept. If $H_0: \beta_1 = \cdots = \beta_k = 0$ is true,
    \begin{enumerate}
    \item What is the distribution of $Y_i$?
    \item What is the distribution of $\frac{SST}{\sigma^2}$? Just write down the answer. You already did it in Assignment 2, and again in Assignment 5.
    \end{enumerate}

\item Still assuming $H_0: \beta_1 = \cdots = \beta_k = 0$ is true, what is the distribution of $SSR/\sigma^2$? Use the formula sheet and show your work.

\item \label{Fstat} Recall the definition of the $F$ distribution. If $W_1 \sim \chi^2(\nu_1)$ and $W_2 \sim \chi^2(\nu_2)$ are independent, $F = \frac{W_1/\nu_1}{W_2/\nu_2} \sim F(\nu_1,\nu_2)$. Show that $F = \frac{SSR/k}{SSE/(n-k-1)}$ has an $F$ distribution under $H_0: \beta_1 = \cdots = \beta_k = 0$. Refer to the results of questions above as you use them.

\item The null hypothesis $H_0: \beta_1 = \cdots = \beta_k = 0$ is less and less believable as $R^2$ becomes larger.
Show that the $F$ statistic of Question~\ref{Fstat} is an increasing function of $R^2$ for fixed $n$ and $k$. This means it makes sense to reject $H_0$ for large values of $F$.

\end{enumerate}

\vspace{20mm}

\noindent
\begin{center}\begin{tabular}{l} \hspace{6in} \\ \hline \end{tabular}\end{center}
This assignment was prepared by \href{http://www.utstat.toronto.edu/~brunner}{Jerry Brunner}, Department of Statistical Sciences, University of Toronto. It is licensed under a
\href{http://creativecommons.org/licenses/by-sa/3.0/deed.en_US}
{Creative Commons Attribution - ShareAlike 3.0 Unported License}. Use any part of it as you like and share the result freely. The \LaTeX~source code is available from the course website:
\href{http://www.utstat.toronto.edu/~brunner/oldclass/302f15}
{\small\texttt{http://www.utstat.toronto.edu/$^\sim$brunner/oldclass/302f15}}

\end{document}