\documentclass[12pt]{article}
%\usepackage{amsbsy} % for \boldsymbol and \pmb
\usepackage{graphicx} % To include pdf files!
\usepackage{amsmath}
\usepackage{amsbsy}
\usepackage{amsfonts}
\usepackage[colorlinks=true, pdfstartview=FitV, linkcolor=blue, citecolor=blue, urlcolor=blue]{hyperref} % For links
\usepackage{fullpage}
%\pagestyle{empty} % No page numbers

\begin{document}
%\enlargethispage*{1000 pt}

\begin{center}
{\Large \textbf{STA 2101/442 Assignment 3}}\footnote{This assignment was prepared by \href{http://www.utstat.toronto.edu/~brunner}{Jerry Brunner}, Department of Statistics, University of Toronto. It is licensed under a \href{http://creativecommons.org/licenses/by-sa/3.0/deed.en_US}{Creative Commons Attribution - ShareAlike 3.0 Unported License}. Use any part of it as you like and share the result freely. The \LaTeX~source code is available from the course website: \href{http://www.utstat.toronto.edu/~brunner/oldclass/appliedf18}{\texttt{http://www.utstat.toronto.edu/$^\sim$brunner/oldclass/appliedf18}}}
\vspace{1 mm}
\end{center}

\noindent These questions are practice for the midterm and final exam, and are not to be handed in.

\begin{enumerate}

%%%%%%%%%%%% Univariate Delta Method %%%%%%%%%%
\item Suppose $X_1, \ldots, X_n$ are a random sample from a distribution with mean $\mu$ and variance $\sigma^2$. The central limit theorem says $\sqrt{n}\left(\overline{X}_n-\mu \right) \stackrel{d}{\rightarrow} T \sim N(0,\sigma^2)$. One version of the delta method says that if $g(x)$ is a function whose derivative is continuous in a neighbourhood of $x=\mu$, then $\sqrt{n}\left( g(\overline{X}_n)- g(\mu) \right) \stackrel{d}{\rightarrow} g^\prime(\mu) T$. In many applications, both $\mu$ and $\sigma^2$ are functions of some parameter $\theta$.
\begin{enumerate}
\item Let $X_1, \ldots, X_n$ be a random sample from a Bernoulli distribution with parameter $\theta$. Find the limiting distribution of
\begin{displaymath}
Z_n = 2\sqrt{n}\left(\sin^{-1}\sqrt{\overline{X}_n}-\sin^{-1}\sqrt{\theta}\right).
\end{displaymath}
Hint: $\frac{d}{dx} \sin^{-1}(x) = \frac{1}{\sqrt{1-x^2}}$. The measurements are in radians, not degrees.
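As a quick numerical sanity check on part (a), a short simulation works well. The Python sketch below (using NumPy) is only an illustration; the values of $n$, $\theta$ and the number of replications are arbitrary choices. If the delta method calculation is correct, the simulated variance of $Z_n$ should be about the same for both values of $\theta$.
\begin{verbatim}
import numpy as np

# Simulation check for part (a): the distribution of Z_n should
# not depend on theta when n is large.  The values of theta, n
# and the number of replications are arbitrary choices.
rng = np.random.default_rng(2101)
n, reps = 400, 100_000
for theta in (0.3, 0.6):
    xbar = rng.binomial(n, theta, size=reps) / n
    z = 2 * np.sqrt(n) * (np.arcsin(np.sqrt(xbar))
                          - np.arcsin(np.sqrt(theta)))
    print(theta, round(z.mean(), 3), round(z.var(), 3))
\end{verbatim}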
\item In a coffee taste test, 100 coffee drinkers tasted coffee made with two different blends of coffee beans, the old standard blend and a new blend. We will adopt a Bernoulli model for these data, with $\theta$ denoting the probability that a customer will prefer the new blend. Suppose 60 out of 100 consumers preferred the new blend of coffee beans. Using your answer to the first part of this question, test $H_0: \theta=\frac{1}{2}$ using a variance-stabilized test statistic. Give the value of the test statistic (a number), and state whether you reject $H_0$ at the usual $\alpha=0.05$ significance level. In plain, non-statistical language, what do you conclude? This is a statement about preference for types of coffee, and of course you will draw a directional conclusion if possible.

\item If the probability of an event is $p$, the \emph{odds} of the event are defined as $p/(1-p)$. Suppose again that $X_1, \ldots, X_n$ are a random sample from a Bernoulli distribution with parameter $\theta$. In this case the \emph{log odds} of $X_i=1$ would be estimated by
\begin{displaymath}
Y_n = \log \frac{\overline{X}_n}{1-\overline{X}_n}.
\end{displaymath}
Naturally, that's the natural log. Find the approximate large-sample distribution (that is, the asymptotic distribution) of $Y_n$. It's normal, of course. Your job is to give the approximate (that is, asymptotic) mean and variance of $Y_n$.

\item Again using the Taste Test data, give a 95\% confidence interval for the log odds of preferring the new blend. Your answer is a pair of numbers.

\pagebreak
\item Let $X_1, \ldots, X_n$ be a random sample from an exponential distribution with parameter $\theta$, so that $E(X_i)=\theta$ and $Var(X_i)=\theta^2$.
\begin{enumerate}
\item Find a variance-stabilizing transformation. That is, find a function $g(x)$ such that the limiting distribution of
\begin{displaymath}
Y_n = \sqrt{n}\left(g(\overline{X}_n)-g(\theta)\right)
\end{displaymath}
does not depend on $\theta$.
\item According to a Poisson process model for calls answered by a service technician, service times (that is, time intervals between taking two successive calls; there is always somebody on hold) are independent exponential random variables with mean $\theta$. In 50 successive calls, one technician's mean service time was 3.4 minutes. Test whether this technician's mean service time differs from the mandated average time of 3 minutes. Use your answer to the first part of this question.
\end{enumerate}
\end{enumerate}

\item Let $X_1, \ldots, X_n$ be a random sample from a uniform distribution on $(0,\theta)$.
\begin{enumerate}
\item What is the limiting distribution of $\sqrt{n}\left(\overline{X}_n-\mu \right)$? Just give the answer; there is no need to show any work.
\item What is the limiting distribution of $2\sqrt{n}\left(\overline{X}_n-\mu \right)$? Just give the answer; there is no need to show any work. But which Slutsky lemma are you using? Check the lecture slides if necessary.
\item Find a variance-stabilizing transformation that produces a standard normal distribution. That is, letting $T_n = 2\overline{X}_n$, find a function $g(x)$ such that the limiting distribution of
\begin{displaymath}
Y_n = \sqrt{n}\left(g(T_n)-g(\theta)\right)
\end{displaymath}
is standard normal.
% g(x) = sqrt(3) log(x)
\end{enumerate}

\item The label on the peanut butter jar says peanuts, partially hydrogenated peanut oil, salt and sugar. But we all know there is other stuff in there too. There is very good reason to assume that the number of rat hairs in a 500g jar of peanut butter has a Poisson distribution with mean $\lambda$, because it's easy to justify a Poisson process model for how the hairs get into the jars. A sample of 30 jars of Brand $A$ yields $\overline{X}=6.8$, while an independent sample of 40 jars of Brand $B$ yields $\overline{Y}=7.275$.
\begin{enumerate}
\item State the model for this problem.
\item What is the parameter space $\Theta$?
\item State the null hypothesis in symbols.
\item Find a variance-stabilizing transformation for the Poisson distribution.
\item Using your variance-stabilizing transformation, derive a test statistic that has an approximate standard normal distribution under $H_0$.
\item Calculate your test statistic for these data. Do you reject the null hypothesis at $\alpha=0.05$? Answer Yes or No.
\item In plain, non-statistical language, what do you conclude? Your answer is something about peanut butter and rat hairs.
\end{enumerate}
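A candidate variance-stabilizing transformation can always be checked by simulation, before or after you derive it. The Python sketch below is only an illustration: the values of $\lambda$ and $n$ are arbitrary, and the candidate $g(x)=\log x$ is deliberately a poor one, so the printed variances change with $\lambda$. A genuine variance-stabilizer would give roughly the same variance for every $\lambda$.
\begin{verbatim}
import numpy as np

# Monte Carlo check of a candidate variance-stabilizing
# transformation g for the Poisson: if g works, the variance of
# Y_n = sqrt(n)*(g(xbar) - g(lam)) should be roughly the same
# constant for every lam.  All numbers are illustrative choices.
rng = np.random.default_rng(442)

def check_candidate(g, lams=(2.0, 6.8, 12.0), n=200):
    reps = 20_000
    for lam in lams:
        xbar = rng.poisson(lam, size=(reps, n)).mean(axis=1)
        y = np.sqrt(n) * (g(xbar) - g(lam))
        print(lam, round(y.var(), 3))

# log is a deliberately poor candidate: the variance changes
# with lambda, so it does not stabilize.
check_candidate(np.log)
\end{verbatim}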
%%%%%%%%% Random Matrices and MVN %%%%%%%%%%%%%%
\item If the $p \times 1$ random vector $\mathbf{x}$ has variance-covariance matrix $\mathbf{\Sigma}$ and $\mathbf{A}$ is an $m \times p$ matrix of constants, prove that the variance-covariance matrix of $\mathbf{Ax}$ is $\mathbf{A \Sigma A}^\top$. Start with the definition of a variance-covariance matrix:
\begin{displaymath}
cov(\mathbf{Z})=E\left[(\mathbf{Z}-\boldsymbol{\mu}_z)(\mathbf{Z}-\boldsymbol{\mu}_z)^\top\right].
\end{displaymath}

\item If the $p \times 1$ random vector $\mathbf{x}$ has mean $\boldsymbol{\mu}$ and variance-covariance matrix $\mathbf{\Sigma}$, show $\mathbf{\Sigma} = E(\mathbf{xx}^\top) - \boldsymbol{\mu \mu}^\top$.

\item Let the $p \times 1$ random vector $\mathbf{x}$ have mean $\boldsymbol{\mu}$ and variance-covariance matrix $\mathbf{\Sigma}$, and let $\mathbf{c}$ be a $p \times 1$ vector of constants. Find $cov(\mathbf{x}+\mathbf{c})$. Show your work.

\item \label{AxB} Let the $p \times 1$ random vector $\mathbf{x}$ have mean $\boldsymbol{\mu}$ and variance-covariance matrix $\mathbf{\Sigma}$; let $\mathbf{A}$ be a $q \times p$ matrix of constants and let $\mathbf{B}$ be an $r \times p$ matrix of constants. Derive a nice simple formula for $cov(\mathbf{Ax},\mathbf{Bx})$.

\item Let $\mathbf{x}$ be a $p \times 1$ random vector with mean $\boldsymbol{\mu}_x$ and variance-covariance matrix $\mathbf{\Sigma}_x$, and let $\mathbf{y}$ be a $q \times 1$ random vector with mean $\boldsymbol{\mu}_y$ and variance-covariance matrix $\mathbf{\Sigma}_y$. Let $\mathbf{\Sigma}_{xy}$ denote the $p \times q$ matrix $cov(\mathbf{x},\mathbf{y}) = E\left((\mathbf{x}-\boldsymbol{\mu}_x)(\mathbf{y}-\boldsymbol{\mu}_y)^\top\right)$.
\begin{enumerate}
\item What is the $(i,j)$ element of $\mathbf{\Sigma}_{xy}$? You don't need to show any work; just write down the answer.
\item Find an expression for $cov(\mathbf{x}+\mathbf{y})$ in terms of $\mathbf{\Sigma}_x$, $\mathbf{\Sigma}_y$ and $\mathbf{\Sigma}_{xy}$. Show your work.
\item Simplify further for the special case where $Cov(X_i,Y_j)=0$ for all $i$ and $j$.
\item Let $\mathbf{c}$ be a $p \times 1$ vector of constants and $\mathbf{d}$ be a $q \times 1$ vector of constants. Find $cov(\mathbf{x}+\mathbf{c}, \mathbf{y}+\mathbf{d})$. Show your work.
\end{enumerate}

\item Let $\mathbf{x}= (X_1,X_2,X_3)^\top$ be multivariate normal with
\begin{displaymath}
\boldsymbol{\mu} = \left( \begin{array}{c} 1 \\ 0 \\ 6 \end{array} \right)
\mbox{ and }
\boldsymbol{\Sigma} = \left( \begin{array}{c c c} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{array} \right).
\end{displaymath}
Let $Y_1=X_1+X_2$ and $Y_2=X_2+X_3$. Find the joint distribution of $Y_1$ and $Y_2$.

\item Let $X_1$ be Normal$(\mu_1, \sigma^2_1)$, and $X_2$ be Normal$(\mu_2, \sigma^2_2)$, independent of $X_1$. What is the joint distribution of $Y_1=X_1+X_2$ and $Y_2=X_1-X_2$? What is required for $Y_1$ and $Y_2$ to be independent? Hint: Use matrices.

\item \label{quad} Show that if $\mathbf{w} \sim N_p(\boldsymbol{\mu},\boldsymbol{\Sigma})$ with $\boldsymbol{\Sigma}$ positive definite, then $Y = (\mathbf{w}-\boldsymbol{\mu})^\top \boldsymbol{\Sigma}^{-1}(\mathbf{w}-\boldsymbol{\mu})$ has a chi-squared distribution with $p$ degrees of freedom.

% UVN
\item You know that if $\mathbf{w} \sim N_p(\boldsymbol{\mu},\boldsymbol{\Sigma})$ and $\mathbf{A}$ is an $r \times p$ matrix of constants, then $\mathbf{Aw}+\mathbf{c} \sim N_r(\mathbf{A}\boldsymbol{\mu} + \mathbf{c}, \mathbf{A}\boldsymbol{\Sigma}\mathbf{A}^\top)$. Use this result to obtain the distribution of the sample mean under normal random sampling. That is, let $X_1, \ldots, X_n$ be a random sample from a $N(\mu,\sigma^2)$ distribution. Find the distribution of $\overline{X}$. You might want to use $\mathbf{1}$ to represent an $n \times 1$ column vector of ones.
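Problem~\ref{quad} can be illustrated numerically. The Python sketch below borrows $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ from the trivariate normal problem above; the number of simulated vectors is an arbitrary choice, and the sketch is an illustration, not a proof.
\begin{verbatim}
import numpy as np

# Monte Carlo illustration of the quadratic-form problem above:
# if w ~ N_p(mu, Sigma) with Sigma positive definite, then
# (w - mu)' Sigma^{-1} (w - mu) behaves like chi-squared(p).
# Here mu and Sigma come from the trivariate normal problem.
rng = np.random.default_rng(3)
mu = np.array([1.0, 0.0, 6.0])
Sigma = np.diag([1.0, 2.0, 1.0])
w = rng.multivariate_normal(mu, Sigma, size=100_000)
d = w - mu
y = np.einsum('ij,jk,ik->i', d, np.linalg.inv(Sigma), d)
print(round(y.mean(), 3), round(y.var(), 3))
# Compare with the mean (3) and variance (6) of chi-squared(3).
\end{verbatim}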
\item Let $X_1, \ldots, X_n$ be independent and identically distributed random variables with $E(X_i)=\mu$ and $Var(X_i) = \sigma^2$.
\begin{enumerate}
\item Show $Cov(\overline{X},(X_j-\overline{X}))=0$ for $j=1, \ldots, n$.
\item Why does this imply that if $X_1, \ldots, X_n$ are normal, $\overline{X}$ and $S^2$ are independent?
\end{enumerate}

\item Recall that the chi-squared distribution with $\nu$ degrees of freedom is just Gamma with $\alpha=\frac{\nu}{2}$ and $\beta=2$. So if $X\sim\chi^2(\nu)$, it has moment-generating function $M_X(t) = (1-2t)^{-\nu/2}$.
\begin{enumerate}
\item Let $W_1 \sim \chi^2(\nu_1)$ and $W_2 \sim \chi^2(\nu_2)$ be independent, and let $W=W_1+W_2$. Find the distribution of $W$. Show your work (there's not much).
\item \label{W1} Let $W=W_1+W_2$, where $W_1$ and $W_2$ are independent, $W\sim\chi^2(\nu_1+\nu_2)$ and $W_2\sim\chi^2(\nu_2)$, where $\nu_1$ and $\nu_2$ are both positive. Show $W_1\sim\chi^2(\nu_1)$.
\item Let $X_1, \ldots, X_n$ be a random sample from a $N(\mu,\sigma^2)$ distribution. Show
\begin{displaymath}
\frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1).
\end{displaymath}
Hint: $\sum_{i=1}^n\left(X_i-\mu \right)^2 = \sum_{i=1}^n\left(X_i-\overline{X} + \overline{X} - \mu \right)^2 = \ldots$
\end{enumerate}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Normal Regression, etc. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\pagebreak
\item Let $\mathbf{y}=\mathbf{X} \boldsymbol{\beta}+\boldsymbol{\epsilon}$, where $\mathbf{X}$ is an $n \times p$ matrix of known constants, $\boldsymbol{\beta}$ is a $p \times 1$ vector of unknown constants, and $\boldsymbol{\epsilon}$ is multivariate normal with mean zero and covariance matrix $\sigma^2 \mathbf{I}_n$. The constant $\sigma^2 > 0$ is unknown.
% In the following, it may be helpful to recall that $(\mathbf{A}^{-1})^\top=(\mathbf{A}^\top)^{-1}$.
\begin{enumerate}
\item The ``hat matrix'' is $\mathbf{H} = \mathbf{X}(\mathbf{X}^\top \mathbf{X})^{-1}\mathbf{X}^\top$.
\begin{enumerate}
\item What are the dimensions (number of rows and columns) of $\mathbf{H}$?
\item Show $\mathbf{H}$ is symmetric.
\item Show $\mathbf{H}$ is ``idempotent,'' meaning $\mathbf{HH} = \mathbf{H}$.
\item Show $\mathbf{I-H}$ is symmetric.
\item Show $\mathbf{I-H}$ is idempotent.
\end{enumerate}
\item What is the distribution of $\mathbf{y}$?
\item The least squares (and maximum likelihood) estimate of $\boldsymbol{\beta}$ is $\widehat{\boldsymbol{\beta}} = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{y}$. What is the distribution of $\widehat{\boldsymbol{\beta}}$? Show the calculations.
\item Let $\widehat{\mathbf{y}}=\mathbf{X}\widehat{\boldsymbol{\beta}}$.
\begin{enumerate}
\item Show $\widehat{\mathbf{y}} = \mathbf{Hy}$.
\item What is the distribution of $\widehat{\mathbf{y}}$? Show the calculation.
\end{enumerate}
\item Let the vector of residuals be $\mathbf{e} = \mathbf{y}-\widehat{\mathbf{y}}$.
\begin{enumerate}
\item Show $\mathbf{e} = (\mathbf{I-H}) \mathbf{y}$.
\item What is the distribution of $\mathbf{e}$? Show the calculations. Simplify both the expected value and the covariance matrix.
\end{enumerate}
\item Using Problem~\ref{AxB}, show that $\mathbf{e}$ and $\widehat{\boldsymbol{\beta}}$ are independent.
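Before proving the matrix facts above, it can be reassuring to spot-check them numerically. The Python sketch below uses a randomly generated design matrix and arbitrary values of $\boldsymbol{\beta}$ and $\sigma$; it is only an illustration.
\begin{verbatim}
import numpy as np

# Numerical spot-check of the facts above (H symmetric and
# idempotent, yhat = Hy, e = (I - H)y) on a random design.
# The design, beta and sigma are arbitrary illustrations.
rng = np.random.default_rng(2101)
n, p, sigma = 200, 4, 2.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ beta + rng.normal(scale=sigma, size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T
betahat = np.linalg.solve(X.T @ X, X.T @ y)
yhat = X @ betahat
e = y - yhat

print(np.allclose(H, H.T), np.allclose(H @ H, H))
print(np.allclose(yhat, H @ y))
print(np.allclose(e, (np.eye(n) - H) @ y))
\end{verbatim}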
\item The least-squares (and maximum likelihood) estimator $\widehat{\boldsymbol{\beta}}$ is obtained by minimizing the sum of squares $Q = (\mathbf{y}-\mathbf{X}\boldsymbol{\beta})^\top (\mathbf{y}-\mathbf{X}\boldsymbol{\beta})$ over all $\boldsymbol{\beta} \in \mathbb{R}^p$.
\begin{enumerate}
\item \label{2parts} Show that $Q = \mathbf{e}^\top\mathbf{e} + (\widehat{\boldsymbol{\beta}} - \boldsymbol{\beta})^\top (\mathbf{X}^\top \mathbf{X}) (\widehat{\boldsymbol{\beta}} - \boldsymbol{\beta})$. Hint: Add and subtract $\widehat{\mathbf{y}}$.
\item Why does this imply that the minimum of $Q(\boldsymbol{\beta})$ occurs at $\boldsymbol{\beta} = \widehat{\boldsymbol{\beta}}$?
\item The columns of $\mathbf{X}$ are linearly independent. Why does linear independence guarantee that the minimum is unique? You have just minimized a function of $p$ variables without calculus.
\item Show that $W_1 = \frac{SSE}{\sigma^2} \sim \chi^2(n-p)$, where $SSE = \mathbf{e}^\top\mathbf{e}$. Use results from earlier parts of this assignment. Start with the distribution of $W = \frac{1}{\sigma^2}(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})^\top (\mathbf{y}-\mathbf{X}\boldsymbol{\beta})$. This is the chi-squared random variable that appears in the denominator of all those $F$ and $t$ statistics.
\end{enumerate}
\end{enumerate}

\end{enumerate}

\end{document}