\documentclass[11pt]{article}
%\usepackage{amsbsy} % for \boldsymbol and \pmb
\usepackage{graphicx} % To include pdf files!
\usepackage{amsmath}
\usepackage{amsbsy}
\usepackage{amsfonts}
\usepackage[colorlinks=true, pdfstartview=FitV, linkcolor=blue, citecolor=blue, urlcolor=blue]{hyperref} % For links
%\usepackage{fullpage}
%\pagestyle{empty} % No page numbers
\oddsidemargin=0in % Good for US Letter paper
\evensidemargin=0in
\textwidth=6.5in
\topmargin=-0.8in
\headheight=0in
\headsep=0.5in
\textheight=9.4in
\begin{document}
%\enlargethispage*{1000 pt}
\begin{center}
{\Large \textbf{STA 2101/442 Assignment 4}}\footnote{This assignment was prepared by \href{http://www.utstat.toronto.edu/~brunner}{Jerry Brunner}, Department of Statistics, University of Toronto. It is licensed under a \href{http://creativecommons.org/licenses/by-sa/3.0/deed.en_US} {Creative Commons Attribution - ShareAlike 3.0 Unported License}. Use any part of it as you like and share the result freely. The \LaTeX~source code is available from the course website: \href{http://www.utstat.toronto.edu/~brunner/oldclass/appliedf17} {\texttt{http://www.utstat.toronto.edu/$^\sim$brunner/oldclass/appliedf17}}}
\vspace{1 mm}
\end{center}

\noindent Except for Question \ref{integrateR}, the questions on this assignment are practice for the quiz on Friday October 6th, and are not to be handed in. Please do the problems using the formula sheet as necessary. A copy of the formula sheet will be distributed with the quiz.

\begin{enumerate}

\item It's easy to say ``All these dummy variable coding schemes are equivalent,'' and the statement is correct --- but exactly what does it mean? Consider the example of a 3-category explanatory variable with categories labelled $A$, $B$ and $C$, and a single quantitative explanatory variable. This can be extended to cover most cases of interest. We have seen two ways of setting up the dummy variables; there are plenty more. For indicator dummy variables with intercept, $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \epsilon$, where $x_1$ and $x_2$ are indicators for categories $A$ and $B$ respectively and $x_3$ is the quantitative variable. For cell means coding, $Y = \alpha_1w_1 + \alpha_2w_2 + \alpha_3w_3 + \alpha_4x_3 + \epsilon$, where $w_1$, $w_2$ and $w_3$ are indicators for categories $A$, $B$ and $C$ respectively and $x_3$ is the same quantitative explanatory variable. This notation is a reminder that when the dummy variable coding changes, the meaning of some parameters will change too.
\begin{enumerate}
\item For each of the two coding schemes, make a table showing how the dummy variables are set up. There should be one row for each category, and a column for each dummy variable. Add another column on the right, showing $E(Y|x)$. I know you have done this before, but it will help. Put the two tables side by side.
\item Clearly if you know $x_1$ and $x_2$, you know $w_1$, $w_2$ and $w_3$ -- and vice versa\footnote{For the model with an intercept, there is actually another dummy variable $x_0$ that always equals one. Thus there are 3 dummy variables in each set.}. The same is true of the regression coefficients. Solve for $\alpha_1, \alpha_2, \alpha_3$ and $\alpha_4$ in terms of $\beta_0, \beta_1, \beta_2$ and $\beta_3$. Your answer consists of four equations.
\item Note that the equations are linear, and it would be easy to solve for the $\beta$ parameters in terms of the $\alpha$ parameters.
Thus the re-parameterization of the vector $\boldsymbol{\beta}$ into the vector $\boldsymbol{\alpha}$ is a $1-1$ linear transformation. That is, $\boldsymbol{\alpha} = \mathbf{A}\boldsymbol{\beta}$, where the matrix $\mathbf{A}$ has an inverse. Give the $4 \times 4$ matrix $\mathbf{A}$. The answer is a matrix of specific numbers (integers).
\item For the general linear model $\mathbf{y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon}$, the one-to-one linear re-parameterization $\boldsymbol{\alpha} = \mathbf{A}\boldsymbol{\beta}$ requires a one-to-one linear transformation of the $\mathbf{X}$ matrix in order not to change what the model says:
\begin{eqnarray*}
\mathbf{y} & = & \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon} \\
 & = & \mathbf{XA}^{-1}\mathbf{A} \boldsymbol{\beta} + \boldsymbol{\epsilon} \\
 & = & \mathbf{W} \boldsymbol{\alpha} + \boldsymbol{\epsilon},
\end{eqnarray*}
where $\mathbf{W} = \mathbf{XA}^{-1}$ and $\boldsymbol{\alpha} = \mathbf{A}\boldsymbol{\beta}$. This is very general, and is not confined to adopting different dummy variable codings. One can start with a linear transformation of $\mathbf{X}$ or with a linear re-parameterization of $\boldsymbol{\beta}$. One requires the other. Finally, here is the question. Give the $4 \times 4$ matrix $\mathbf{A}^{-1}$ for our little dummy variable problem. The answer is a matrix of specific numbers. I actually used R's \texttt{solve} function to get it. If you do it this way, don't bother to bring that printout.
\item Give equations for $w_1$, $w_2$ and $w_3$ in terms of $x_0$, $x_1$ and $x_2$. Your answer consists of three scalar equations. This confirms that switching dummy variable coding is a linear transformation for this example.
\item Now we move to a more general setting in which $\boldsymbol{\alpha} = \mathbf{A}\boldsymbol{\beta}$ is just a 1-1 linear re-parameterization and $\mathbf{W} = \mathbf{XA}^{-1}$ is the corresponding transformation of the explanatory variables.
\item Write the least-squares estimate $\widehat{\boldsymbol{\alpha}}$ in terms of $\widehat{\boldsymbol{\beta}}$. Show the calculation. I think it's easiest to start with $\widehat{\boldsymbol{\alpha}}$ and substitute.
\item Call $\mathbf{y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon}$ the \emph{original} model, and $\mathbf{y} = \mathbf{W} \boldsymbol{\alpha} + \boldsymbol{\epsilon}$ the \emph{re-parameterized} model. Compare the vector of predicted $y$ values $\widehat{\mathbf{y}}$ from the re-parameterized model to $\widehat{\mathbf{y}}$ from the original model.
\item Compare the vector of residuals $\mathbf{e}$ from the re-parameterized model to $\mathbf{e}$ from the original model.
\item Compare the proportion of explained variation $R^2$ from the original and re-parameterized models.
\item Consider the null hypothesis $H_0: \mathbf{L}\boldsymbol{\beta} = \mathbf{h}$ based on the original model. The corresponding (logically equivalent, if and only if) null hypothesis for the re-parameterized model is $H_0: \mathbf{K}\boldsymbol{\alpha} = \mathbf{h}$. Give a formula for $\mathbf{K}$. Show a little work.
\item Compare the $F$ statistic for testing $H_0: \mathbf{L}\boldsymbol{\beta} = \mathbf{h}$ to the $F$ statistic for testing $H_0: \mathbf{K}\boldsymbol{\alpha} = \mathbf{h}$. Start with the formula for the second one, and then substitute. Show your work. Use the formula sheet.
\end{enumerate}
The overall story is that (all) these dummy variable schemes are equivalent in the sense that they lead to the same predictions and the same conclusions.
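If you would like to see the equivalence numerically before proving it, here is a small optional R sketch (not part of the assignment, and not to be handed in). It simulates an arbitrary data set, fits the model under both coding schemes with \texttt{lm}, and checks that the fitted values and residuals agree; the variable names, seed and sample size are arbitrary choices for illustration.
\begin{verbatim}
# Optional illustration: both dummy variable coding schemes give the same fit.
# Simulated data; the names, seed and sample size are arbitrary.
set.seed(9999)
group <- rep(c("A", "B", "C"), each = 10)   # ten cases per category
n <- length(group)
x3 <- rnorm(n)                       # quantitative explanatory variable
y  <- rnorm(n)                       # any response will do for this check
x1 <- as.numeric(group == "A")       # indicator dummy variables (with intercept)
x2 <- as.numeric(group == "B")
w1 <- x1; w2 <- x2                   # cell means coding
w3 <- as.numeric(group == "C")
betamod  <- lm(y ~ x1 + x2 + x3)            # beta parameterization
alphamod <- lm(y ~ 0 + w1 + w2 + w3 + x3)   # alpha parameterization, no intercept
max(abs(fitted(betamod) - fitted(alphamod)))   # essentially zero
max(abs(resid(betamod)  - resid(alphamod)))    # essentially zero
coef(betamod); coef(alphamod)   # coefficients differ; compare with part (b)
\end{verbatim}
The \texttt{0 +} (equivalently \texttt{- 1}) in the second model formula suppresses the intercept, which is what cell means coding requires.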
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Large sample %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\item Suppose $\sqrt{n}(T_n-\theta) \stackrel{d}{\rightarrow} T$. Show $T_n \stackrel{p}{\rightarrow} \theta$. Please use Slutsky lemmas rather than definitions.

\item Let $X_1 , \ldots, X_n$ be a random sample from a Binomial distribution with parameters $3$ and $\theta$. That is,
\begin{displaymath}
P(X_i = x_i) = \binom{3}{x_i} \theta^{x_i} (1-\theta)^{3-x_i},
\end{displaymath}
for $x_i=0,1,2,3$. Find the maximum likelihood estimator of $\theta$, and show that it is strongly consistent.

\item Let $X_1 , \ldots, X_n$ be a random sample from a continuous distribution with density
\begin{displaymath}
f(x;\tau) = \frac{\tau^{1/2}}{\sqrt{2\pi}} \, e^{-\frac{\tau x^2}{2}},
\end{displaymath}
where the parameter $\tau>0$. Let
\begin{displaymath}
\widehat{\tau} = \frac{n}{\sum_{i=1}^n X_i^2}.
\end{displaymath}
Is $\widehat{\tau}$ a consistent estimator of $\tau$? Answer Yes or No and prove your answer. Hint: You can just write down $E(X^2)$ by inspection. This is a very familiar distribution.

\item Let $X_1, \ldots, X_n$ be a random sample from a distribution with mean $\mu$. Show that $T_n = \frac{1}{n+400}\sum_{i=1}^n X_i$ is a strongly consistent estimator of $\mu$. % That could be a quiz Q

\item Let $X_1, \ldots, X_n$ be a random sample from a distribution with mean $\mu$ and variance $\sigma^2$. Prove that the sample variance $S^2=\frac{\sum_{i=1}^n(X_i-\overline{X})^2}{n-1}$ is a strongly consistent estimator of $\sigma^2$.

\item \label{randiv} Independently for $i = 1 , \ldots, n$, let
\begin{displaymath}
Y_i = \beta X_i + \epsilon_i,
\end{displaymath}
where $E(X_i)=E(\epsilon_i)=0$, $Var(X_i)=\sigma^2_X$, $Var(\epsilon_i)=\sigma^2_\epsilon$, and $\epsilon_i$ is independent of $X_i$. Let
\begin{displaymath}
\widehat{\beta}_n = \frac{\sum_{i=1}^n X_i Y_i}{\sum_{i=1}^n X_i^2}.
\end{displaymath}
Is $\widehat{\beta}_n$ a consistent estimator of $\beta$? Answer Yes or No and prove your answer.

\item In this problem, you'll use (without proof) the \emph{variance rule}, which says that if $\theta$ is a real constant and $T_1, T_2, \ldots$ is a sequence of random variables with
\begin{displaymath}
\lim_{n \rightarrow \infty} E(T_n) = \theta \mbox{ and } \lim_{n \rightarrow \infty} Var(T_n) = 0,
\end{displaymath}
then $T_n \stackrel{P}{\rightarrow} \theta$. In Problem~\ref{randiv}, the independent variables are random. Here they are fixed constants, which is more standard (though a little strange if you think about it). Accordingly, let
\begin{displaymath}
Y_i = \beta x_i + \epsilon_i
\end{displaymath}
for $i=1, \ldots, n$, where $\epsilon_1, \ldots, \epsilon_n$ are a random sample from a distribution with expected value zero and variance $\sigma^2$, and $\beta$ and $\sigma^2$ are unknown constants.
\begin{enumerate}
\item What is $E(Y_i)$?
\item What is $Var(Y_i)$?
% \item Find the Least Squares estimate of $\beta$ by minimizing
% $Q=\sum_{i=1}^n(Y_i-\beta x_i)^2$ over all values of $\beta$. Let $\widehat{\beta}_n$ denote the point at which $Q$ is minimal.
\item Use the same estimator as in Problem~\ref{randiv}. Is $\widehat{\beta}_n$ unbiased? Answer Yes or No and show your work.
\item Suppose that the sequence of constants $\sum_{i=1}^n x_i^2 \rightarrow \infty$ as $n \rightarrow \infty$.
Does this guarantee $\widehat{\beta}_n$ will be consistent? Answer Yes or No. Show your work.
% There are other conditions on the $x_i$ values that work, but this might be the best.
\item Let $\widehat{\beta}_{2,n} = \frac{\overline{Y}_n}{\overline{x}_n}$. Is $\widehat{\beta}_{2,n}$ unbiased? Consistent? Answer Yes or No to each question and show your work. Do you need a condition on the $x_i$ values?
\item Prove that $\widehat{\beta}_n$ is a more accurate estimator than $\widehat{\beta}_{2,n}$ in the sense that it has smaller variance. Hint: The sample variance of the explanatory variable values cannot be negative.
\end{enumerate}

\item Let $X$ be a random variable with expected value $\mu$ and variance $\sigma^2$. Show $\frac{X}{n} \stackrel{p}{\rightarrow} 0$.

\item Let $X_1 , \ldots, X_n$ be a random sample from a Gamma distribution with $\alpha=\beta=\theta>0$. That is, the density is
\begin{displaymath}
f(x;\theta) = \frac{1}{\theta^\theta \Gamma(\theta)} e^{-x/\theta} x^{\theta-1},
\end{displaymath}
for $x>0$. Let $\widehat{\theta} = \overline{X}_n$. Is $\widehat{\theta}$ a consistent estimator of $\theta$? Answer Yes or No and prove your answer.

\pagebreak

\item The ordinary univariate Central Limit Theorem says that if $X_1, \ldots, X_n$ are a random sample (independent and identically distributed) from a distribution with expected value $\mu$ and variance $\sigma^2$, then
\begin{displaymath}
Z_n^{(1)} = \frac{\sqrt{n}(\overline{X}_n-\mu)}{\sigma} \stackrel{d}{\rightarrow} Z \sim N(0,1).
\end{displaymath}
An application of some Slutsky theorems (see lecture slides) shows that also,
\begin{displaymath}
Z_n^{(2)} = \frac{\sqrt{n}(\overline{X}_n-\mu)}{\widehat{\sigma}_n} \stackrel{d}{\rightarrow} Z \sim N(0,1),
\end{displaymath}
where $\widehat{\sigma}_n$ is any consistent estimator of $\sigma$. For this problem, suppose that $X_1, \ldots, X_n$ are Bernoulli($\theta$).
\begin{enumerate}
\item What is $\mu$?
\item What is $\sigma^2$?
\item Re-write $Z_n^{(1)}$ for the Bernoulli example.
\item What about $Z_n = \frac{\sqrt{n}(\overline{X}_n-\theta)} {\sqrt{\overline{X}_n(1-\overline{X}_n)}}$? Does $Z_n$ converge in distribution to a standard normal? Why or why not?
\item What about the $t$ statistic $T_n = \frac{\sqrt{n}(\overline{X}_n-\mu)}{S_n}$, where $S_n$ is the sample standard deviation? Does $T_n$ converge in distribution to a standard normal? Why or why not?
\end{enumerate}

\item \label{integrateR} Here is an integral you cannot do in closed form, and numerical integration is challenging. For example, R's \texttt{integrate} function fails.
\begin{displaymath}
\int_0^{1/2} e^{\cos(1/x)} \, dx
\end{displaymath}
Using R, approximate the integral with Monte Carlo integration, and give a 99\% confidence interval for your answer. You need to produce 3 numbers: the estimate, a lower confidence limit and an upper confidence limit. \textbf{Please bring your printout to the quiz}.

\end{enumerate}

\end{document}

%%%%%%%%%%%% Univariate Delta Method problems - see 2013 %%%%%%%%%%