% 302f16Assignment2.tex Mostly Ch. 1 in Muni's book
\documentclass[12pt]{article}
%\usepackage{amsbsy} % for \boldsymbol and \pmb
\usepackage{graphicx} % To include pdf files!
\usepackage{amsmath}
\usepackage{amsbsy}
\usepackage{amsfonts}
\usepackage[colorlinks=true, pdfstartview=FitV, linkcolor=blue, citecolor=blue, urlcolor=blue]{hyperref} % For links
\usepackage{fullpage}
%\pagestyle{empty} % No page numbers

\begin{document}
%\enlargethispage*{1000 pt}
\begin{center}
{\Large \textbf{STA 302f17 Assignment Two}}\footnote{Copyright information is at the end of the last page.}
\vspace{1 mm}
\end{center}

\noindent
Please bring your R printout from Question~\ref{R} to Quiz Two; you may be asked to hand it in, or maybe not. The other problems are preparation for the quiz in tutorial, and are not to be handed in. Starting with Problem~\ref{mgfstart}, you can play a little game. Try not to do the same work twice. Instead, use results of earlier problems whenever possible.

\vspace{3mm}

\begin{enumerate}
%%%%%%%%%%%%%%%% Stats
\item This problem is more review, this time of statistical concepts you likely encountered in STA258. Let $y_1, \ldots, y_n$ be a random sample\footnote{Random sample means independent and identically distributed.} from a normal distribution with mean $\mu$ and variance $\sigma^2$, so that $T = \frac{\sqrt{n}(\overline{y}-\mu)}{S} \sim t(n-1)$. This is something you don't need to prove, for now.
\begin{enumerate}
\item Derive a $(1-\alpha)100\%$ confidence interval for $\mu$. ``Derive'' means show all the high school algebra. Use the symbol $t_{\alpha/2}$ for the number satisfying $Pr(T>t_{\alpha/2})= \alpha/2$.
\item \label{ci} A random sample with $n=23$ yields $\overline{y} = 2.57$ and a sample variance of $S^2=5.85$. Using the critical value $t_{0.025}=2.07$, give a 95\% confidence interval for $\mu$. The answer is a pair of numbers, the lower confidence limit and the upper confidence limit.
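As a quick sanity check of the arithmetic, the interval $\overline{y} \pm t_{0.025}\, S/\sqrt{n}$ can be computed with the numbers given in the problem. Here is a sketch in Python (a calculator or an R session would serve just as well):

```python
from math import sqrt

# Numbers from Question (b): n = 23, ybar = 2.57, sample variance S^2 = 5.85,
# and the given critical value t_{0.025} = 2.07.
n, ybar, s2, t = 23, 2.57, 5.85, 2.07

margin = t * sqrt(s2 / n)                # t_{0.025} * S / sqrt(n)
lower, upper = ybar - margin, ybar + margin
print(round(lower, 2), round(upper, 2))  # prints: 1.53 3.61
```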
Please \textbf{bring a calculator to the quiz} in case you have to do something like this.
\item Using the numbers from Question~\ref{ci}, test $H_0: \mu=3$ at $\alpha=0.05$.
\begin{enumerate}
\item Give the value of the $T$ statistic. The answer is a number.
\item State whether you reject $H_0$, Yes or No.
\item Can you conclude that $\mu$ is different from 3? Answer Yes or No.
\item If the answer is Yes, state whether $\mu>3$ or $\mu<3$. Pick one.
\end{enumerate}
\end{enumerate}
%%%%%%%%%%%%%%%% Text on simple regression
\item In the textbook \emph{Regression Analysis}, please read pages 1--7 for general ideas. Then read Section 1.4 on pages 7--9. The chapter has an appendix with some derivations, too.
\begin{enumerate}
\item In formula (1.5), assume $E(\epsilon_i) = 0$. What is $E(y_i)$?
\item Partially differentiate expression (1.8) to obtain formulas (1.11) and (1.13) for $b_0$ and $b_1$ on page 8.
\item Prove that (1.13) and (1.14) are equal.
\item On p.~9, the text says $\sum_{i=1}^n(x_i-\overline{x}) = 0$. Prove it.
\item For the centered model (1.15), partially differentiate to obtain the least squares estimates of $\gamma_0$ and $\beta_1$. Is the least-squares estimate of the slope affected by centering?
\item For the simple regression model (1.5), show that the residuals add to zero as claimed on p.~9.
\item Do Exercise 1.6. See the definition of least squares estimation in the lecture slides.
\item \label{R} Do Exercise 1.9 using R. I got the data in using the \texttt{c()} function -- \texttt{c} for combine. The problem is asking you to calculate $b_0$ and $b_1$. \textbf{Bring your printout to the quiz. You may be asked to hand it in.} Note that while the textbook gives $\log(y)$ rounded to two decimal places, you don't need to round if you are using R -- so please don't round. When the text says $\log(y)$, does this mean the natural log, or log base ten?
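To see formulas (1.11) and (1.13) in action before tackling Exercise 1.9, here is a small sketch in Python with made-up data -- these are \emph{not} the textbook's Exercise 1.9 numbers, which must come from the text. It also checks numerically that the residuals sum to zero, as one of the problems above asks you to prove:

```python
import numpy as np

# Illustrative data only -- NOT the Exercise 1.9 data from the textbook.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least squares estimates in the form of formulas (1.11) and (1.13):
#   b1 = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2),   b0 = ybar - b1 * xbar
xbar, ybar = x.mean(), y.mean()
b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
b0 = ybar - b1 * xbar

# The residuals from the fitted line add to zero (up to floating-point error).
e = y - (b0 + b1 * x)
print(b0, b1, e.sum())
```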
One more comment is that if you plot $x$ versus $y$ (not requested by this question), you see a clearly curved relationship, while a plot of $x$ versus $\log(y)$ is very close to a straight line. Transformation of $y$ is a good curve-fitting trick.
\end{enumerate}
\item Please read Section 1.5 in the textbook \emph{Regression Analysis}. Assume $E(\epsilon_i) = 0$.
\begin{enumerate}
\item In formula (1.20), what is $E(y_i)$?
\item Obtain the least squares estimator of $\beta_1$ for this model. Show all your work including the second derivative test.
\item Do Exercise 1.1.
\item What is $\sum_{i=1}^n e_i$ for this model? Must it be equal to zero? Answer Yes or No.
\item What is $E(e_i)$? What is $E(\sum_{i=1}^n e_i)$?
\end{enumerate}
% \pagebreak
%%%%%%%%%%%%%%%%%%%%%%%%% MGF %%%%%%%%%%%%%%%%%%%%%%%%%
\item \label{mgfstart} Denote the moment-generating function of a random variable $x$ by $M_x(t)$. The moment-generating function is defined by $M_x(t) = E(e^{xt})$.
\begin{enumerate}
\item Let $a$ be a constant. Prove that $M_{ax}(t) = M_x(at)$.
\item Prove that $M_{x+a}(t) = e^{at}M_x(t)$.
\item Let $x_1, \ldots, x_n$ be \emph{independent} random variables. Prove that
\begin{displaymath}
M_{\sum_{i=1}^n x_i}(t) = \prod_{i=1}^n M_{x_i}(t).
\end{displaymath}
For convenience, you may assume that $x_1, \ldots, x_n$ are all continuous, so you will integrate. This is not the way I did it in class.
\end{enumerate}
\item Recall that if $x\sim N(\mu,\sigma^2)$, it has moment-generating function $M_x(t) = e^{\mu t + \frac{1}{2}\sigma^2t^2}$.
\begin{enumerate}
\item Let $x\sim N(\mu,\sigma^2)$ and $y=ax+b$, where $a$ and $b$ are constants. Use moment-generating functions to find the distribution of $y$. Show your work.
\item Let $x\sim N(\mu,\sigma^2)$ and $z = \frac{x-\mu}{\sigma}$. Find the distribution of $z$. Show your work.
\item Let $x_1, \ldots, x_n$ be a random sample from a $N(\mu,\sigma^2)$ distribution.
Use moment-generating functions to find the distribution of $y = \sum_{i=1}^nx_i$. Show your work.
\item Let $x_1, \ldots, x_n$ be a random sample from a $N(\mu,\sigma^2)$ distribution. Use moment-generating functions to find the distribution of the sample mean $\overline{x}$. Show your work.
\item Let $x_1, \ldots, x_n$ be a random sample from a $N(\mu,\sigma^2)$ distribution. Find the distribution of $z = \frac{\sqrt{n}(\overline{x}-\mu)}{\sigma}$. Show your work.
\item Let $x_1, \ldots, x_n$ be independent random variables, with $x_i \sim N(\mu_i,\sigma_i^2)$. Let $a_1, \ldots, a_n$ be constants. Use moment-generating functions to find the distribution of $y = \sum_{i=1}^n a_ix_i$. Show your work.
\end{enumerate}
\item For the model of formula (1.20) in the text, suppose that the $\epsilon_i$ are normally distributed and independent, which is the usual assumption. What is the distribution of $y_i$? What is the distribution of $b_1$? Use earlier work to obtain the answers without directly using moment-generating functions.
% \pagebreak
\item A Chi-squared random variable $x$ with parameter $\nu>0$ has moment-generating function $M_x(t) = (1-2t)^{-\nu/2}$ for $t < \frac{1}{2}$.
\begin{enumerate}
\item Let $x_1, \ldots, x_n$ be independent random variables with $x_i \sim \chi^2(\nu_i)$ for $i=1, \ldots, n$. Find the distribution of $y = \sum_{i=1}^n x_i$.
\item Let $z \sim N(0,1)$. Find the distribution of $y=z^2$. For this one, you need to integrate. Recall that the density of a normal random variable is $f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$. You will still use moment-generating functions.
\item Let $x_1, \ldots, x_n$ be a random sample from a $N(\mu,\sigma^2)$ distribution. Find the distribution of $y = \frac{1}{\sigma^2} \sum_{i=1}^n\left(x_i-\mu \right)^2$.
\item Let $y=x_1+x_2$, where $x_1$ and $x_2$ are independent, $x_2\sim\chi^2(\nu_2)$ and $y\sim\chi^2(\nu_1+\nu_2)$, where $\nu_1$ and $\nu_2$ are both positive. Show $x_1\sim\chi^2(\nu_1)$.
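A simulation is not a proof, but it can catch algebra mistakes. The following Python sketch (with arbitrary, made-up values of $n$, $\mu$ and $\sigma$; it is not part of the assignment) is consistent with two of the facts above: if $z \sim N(0,1)$ then $z^2 \sim \chi^2(1)$, with mean 1 and variance 2, and $\frac{1}{\sigma^2}\sum_{i=1}^n\left(x_i-\mu\right)^2 \sim \chi^2(n)$, with mean $n$ and variance $2n$:

```python
import numpy as np

rng = np.random.default_rng(302)  # arbitrary seed for reproducibility

# If z ~ N(0,1), then z^2 ~ chi-squared(1): mean 1, variance 2.
z2 = rng.standard_normal(500_000) ** 2
print(z2.mean(), z2.var())  # both should be close to 1 and 2

# If x_1, ..., x_n is a random sample from N(mu, sigma^2), then
# sum((x_i - mu)^2) / sigma^2 ~ chi-squared(n): mean n, variance 2n.
n, mu, sigma = 10, 5.0, 2.0   # made-up values for illustration
x = rng.normal(mu, sigma, size=(200_000, n))
y = ((x - mu) ** 2).sum(axis=1) / sigma ** 2
print(y.mean(), y.var())      # should be close to 10 and 20
```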
\item Let $x_1, \ldots, x_n$ be a random sample from a $N(\mu,\sigma^2)$ distribution. Show
\begin{displaymath}
\frac{(n-1)s^2}{\sigma^2} \sim \chi^2(n-1),
\end{displaymath}
where $s^2 = \frac{\sum_{i=1}^n\left(x_i-\overline{x} \right)^2 }{n-1}$. Hint: $\sum_{i=1}^n\left(x_i-\mu \right)^2 = \sum_{i=1}^n\left(x_i-\overline{x} + \overline{x} - \mu \right)^2 = \ldots$

For this question, you may use the independence of $\overline{x}$ and $s^2$ without proof. We will prove it later.

Note: This is a special case of a central result that will be used throughout most of the course.
\end{enumerate}
\end{enumerate}

% \vspace{30mm}
\noindent
\begin{center}\begin{tabular}{l} \hspace{6in} \\ \hline \end{tabular}\end{center}
This assignment was prepared by \href{http://www.utstat.toronto.edu/~brunner}{Jerry Brunner}, Department of Statistical Sciences, University of Toronto. It is licensed under a \href{http://creativecommons.org/licenses/by-sa/3.0/deed.en_US}{Creative Commons Attribution - ShareAlike 3.0 Unported License}. Use any part of it as you like and share the result freely. The \LaTeX~source code is available from the course website:
\href{http://www.utstat.toronto.edu/~brunner/oldclass/302f17}{\small\texttt{http://www.utstat.toronto.edu/$^\sim$brunner/oldclass/302f17}}

\end{document}