% 302f20Assignment2.tex More review plus a little simple regression
\documentclass[11pt]{article}
%\usepackage{amsbsy} % for \boldsymbol and \pmb
\usepackage{graphicx} % To include pdf files!
\usepackage{amsmath}
\usepackage{amsbsy}
\usepackage{amsfonts}
\usepackage[colorlinks=true, pdfstartview=FitV, linkcolor=blue, citecolor=blue, urlcolor=blue]{hyperref} % For links
\usepackage{fullpage}
%\pagestyle{empty} % No page numbers

\begin{document}
%\enlargethispage*{1000 pt}
\begin{center}
{\Large \textbf{STA 302f20 Assignment Two}}\footnote{This assignment was prepared by \href{http://www.utstat.toronto.edu/~brunner}{Jerry Brunner}, Department of Statistical Sciences, University of Toronto. It is licensed under a \href{http://creativecommons.org/licenses/by-sa/3.0/deed.en_US} {Creative Commons Attribution - ShareAlike 3.0 Unported License}. Use any part of it as you like and share the result freely. The \LaTeX~source code is available from the course website: \href{http://www.utstat.toronto.edu/~brunner/oldclass/302f20} {\small\texttt{http://www.utstat.toronto.edu/$^\sim$brunner/oldclass/302f20}}}
\vspace{1 mm}
\end{center}

\noindent Please do these review questions in preparation for Quiz Two; they are not to be handed in. Use the formula sheet on the course website. Starting with Problem~\ref{mgfstart}, you can play a little game. Try not to do the same work twice. Instead, use results of earlier problems whenever possible.

\vspace{3mm}

\begin{enumerate}
\item Read Chapter 2 in \emph{Linear models in statistics}, optionally skipping Sections 2.8 (generalized inverses), 2.13 (idempotent matrices) and 2.14 (vector and matrix calculus). Do problems 2.6a, 2.6d, 2.7b, 2.7c, 2.17c, 2.17d, 2.20, 2.23, 2.24. You should be able to do these problems without reading anything, but the assigned reading will help, soon.

\item Let $\mathbf{A}$ be a non-singular square matrix. Prove that $\mathbf{A}^{-1}$ is unique by letting both $\mathbf{B}$ and $\mathbf{C}$ be inverses of $\mathbf{A}$, and then showing $\mathbf{B} = \mathbf{C}$.

\item This problem is more review, this time of statistical concepts you encountered in STA260 and probably STA258. Let $y_1, \ldots, y_n$ be a random sample (that is, independent and identically distributed) from a normal distribution with mean $\mu$ and variance $\sigma^2$, so that $t = \frac{\sqrt{n}(\overline{y}-\mu)}{s} \sim t(n-1)$. This is something you don't need to prove, for now.
\begin{enumerate}
\item Derive a $(1-\alpha)100\%$ confidence interval for $\mu$. ``Derive'' means show all the high school algebra. Use the symbol $t_{1-\alpha/2}$ for the number satisfying $Pr(T \leq t_{1-\alpha/2})= 1-\alpha/2$.
\item \label{ci} A random sample with $n=23$ yields $\overline{y} = 2.57$ and a sample variance of $s^2=5.85$.
\begin{enumerate}
\item Use R to find the critical value $t_{0.975}$.
\item Give a 95\% confidence interval for $\mu$. The answer is a pair of numbers, the lower confidence limit and the upper confidence limit.
\end{enumerate}
\item Using the sample statistics from Question~\ref{ci}, test $H_0: \mu=3$ versus $H_1: \mu \neq 3$ at $\alpha=0.05$.
\begin{enumerate}
\item Give the value of the $T$ statistic. The answer is a number.
\item What is the critical value? The answer is a number.
\item State whether you reject $H_0$, Yes or No.
\item What is the $p$-value? Give the number and the R command that produced it.
\item Can you conclude that $\mu$ is different from 3? Answer Yes or No.
\item If the answer is Yes, state whether $\mu>3$ or $\mu<3$. Pick one.
\end{enumerate}
\end{enumerate}
\vspace{30mm}

\pagebreak
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\item \label{mgfstart} Denote the moment-generating function of a random variable $x$ by $M_x(t)$. The moment-generating function is defined by $M_x(t) = E(e^{xt})$.
\begin{enumerate}
\item Let $a$ be a constant. Prove that $M_{ax}(t) = M_x(at)$.
\item Prove that $M_{x+a}(t) = e^{at}M_x(t)$.
\item Let $x_1, \ldots, x_n$ be \emph{independent} random variables. Prove that
\begin{displaymath}
M_{\sum_{i=1}^n x_i}(t) = \prod_{i=1}^n M_{x_i}(t).
\end{displaymath}
Clearly indicate where you use independence.
\end{enumerate}

\item Recall that if $x\sim N(\mu,\sigma^2)$, it has moment-generating function $M_x(t) = e^{\mu t + \frac{1}{2}\sigma^2t^2}$. You will not have to prove this.
\begin{enumerate}
\item Let $x\sim N(\mu,\sigma^2)$ and $y=ax+b$, where $a$ and $b$ are constants. Use moment-generating functions to find the distribution of $y$. Show your work.
\item Let $x\sim N(\mu,\sigma^2)$ and $z = \frac{x-\mu}{\sigma}$. Use moment-generating functions to find the distribution of $z$. Show your work.
\item Let $x_1, \ldots, x_n$ be a random sample from a $N(\mu,\sigma^2)$ distribution. Use moment-generating functions to find the distribution of $y = \sum_{i=1}^nx_i$. Show your work.
\item Let $x_1, \ldots, x_n$ be a random sample from a $N(\mu,\sigma^2)$ distribution. Use moment-generating functions to find the distribution of the sample mean $\overline{x}$.
\item Let $x_1, \ldots, x_n$ be a random sample from a $N(\mu,\sigma^2)$ distribution. Find the distribution of $z = \frac{\sqrt{n}(\overline{x}-\mu)}{\sigma}$. Show your work.
\item Let $x_1, \ldots, x_n$ be independent random variables, with $x_i \sim N(\mu_i,\sigma_i^2)$. Let $a_0, \ldots, a_n$ be constants. Use moment-generating functions to find the distribution of $y = a_0 + \sum_{i=1}^n a_ix_i$. Show your work. This is a big deal, because it establishes that any linear combination of independent normals is normal. Thus, to find the distribution of any linear combination of independent normals, all you need to do is calculate the expected value and variance.
\end{enumerate}

\item A Chi-squared random variable $x$ with parameter $\nu>0$ has moment-generating function $M_x(t) = (1-2t)^{-\nu/2}$ for $t < 1/2$. You will not have to prove this.
\begin{enumerate}
\item Let $x_1, \ldots, x_n$ be independent random variables with $x_i \sim \chi^2(\nu_i)$ for $i=1, \ldots, n$. Find the distribution of $y = \sum_{i=1}^n x_i$.
\item Let $z \sim N(0,1)$. Find the distribution of $y=z^2$ using moment-generating functions. For this one, you need to integrate.
% Recall that the density of a normal random variable is $f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$. You will still use moment-generating functions.
\item Let $x_1, \ldots, x_n$ be a random sample from a $N(\mu,\sigma^2)$ distribution. Find the distribution of $y = \frac{1}{\sigma^2} \sum_{i=1}^n\left(x_i-\mu \right)^2$.
\item Let $y=x_1+x_2$, where $x_1$ and $x_2$ are independent, $x_2\sim\chi^2(\nu_2)$ and $y\sim\chi^2(\nu_1+\nu_2)$, where $\nu_1$ and $\nu_2$ are both positive. Show $x_1\sim\chi^2(\nu_1)$.
\item Let $x_1, \ldots, x_n$ be a random sample from a $N(\mu,\sigma^2)$ distribution. Show
\begin{displaymath}
\frac{(n-1)s^2}{\sigma^2} \sim \chi^2(n-1),
\end{displaymath}
where $s^2 = \frac{\sum_{i=1}^n\left(x_i-\overline{x} \right)^2 }{n-1}$.
Hint: $\sum_{i=1}^n\left(x_i-\mu \right)^2 = \sum_{i=1}^n\left(x_i-\overline{x} + \overline{x} - \mu \right)^2 = \ldots$ For this question, you may use the independence of $\overline{x}$ and $s^2$ without proof. We will prove it later.
Note: This is a special case of a central result that will be used throughout most of the course.
\end{enumerate}

% \pagebreak
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\item We return to simple linear regression (see problem 14 from last week). ``Simple'' means that there is just one explanatory variable. Here's the model. Independently for $i=1, \ldots,n$, let $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$, where $\beta_0$ and $\beta_1$ are unknown constants (parameters), $x_1, \ldots, x_n$ are known, observable constants, and $\epsilon_1, \ldots, \epsilon_n$ are independent random variables with expected value zero and unknown variance $\sigma^2$.
\begin{enumerate}
\item \label{LS} In \emph{least squares} estimation, one first writes the expected value of $y_i$ as a function of $\boldsymbol{\beta} = (\beta_0,\beta_1)$, say $E_{\boldsymbol{\beta}}(y_i)$, and then one estimates the $\beta_j$ by choosing values that get the $y_i$ as close as possible to their expected values, in the sense of minimizing $Q(\boldsymbol{\beta}) = \sum_{i=1}^n(y_i - E_{\boldsymbol{\beta}}(y_i))^2$ over all $\boldsymbol{\beta}$ values. Following this recipe, obtain formulas for the least squares estimates of $\beta_0$ and $\beta_1$. Don't bother with second derivative tests. There is a better way to verify that you have found the minimum; we will cover it later.
\item Suppose the $\epsilon_i$ are normally distributed. Using results from earlier in this assignment, what is the distribution of $y_i$?
\item Starting from $\widehat{\beta}_1 = \frac{\sum_{i=1}^n(x_i-\overline{x})(y_i-\overline{y})} {\sum_{i=1}^n(x_i-\overline{x})^2}$, show $\widehat{\beta}_1 = \frac{\sum_{i=1}^n(x_i-\overline{x})y_i} {\sum_{i=1}^n(x_i-\overline{x})^2}$.
\item Using the preceding result,
\begin{enumerate}
\item What is the distribution of $\widehat{\beta}_1$ if the $\epsilon_i$ are normal?
\item What is $Cov(\overline{y},\widehat{\beta}_1)$?
\item What is the distribution of $\widehat{\beta}_0 = \overline{y} - \widehat{\beta}_1\overline{x}$ if the $\epsilon_i$ are normal?
\item What is $Cov(\widehat{\beta}_0,\widehat{\beta}_1)$?
\end{enumerate}
\item Calculate $\widehat{\beta}_0$ and $\widehat{\beta}_1$ for the following data set. Your answers are numbers. Use R. You might be asked to use R on the quiz.
\begin{verbatim}
x  0.0  1.3  3.2 -2.5 -4.6 -1.6  4.5  3.8
y -0.8 -1.3  7.4 -5.2 -6.5 -4.9  9.9  7.2
\end{verbatim}
\end{enumerate} % End simple regression question

\end{enumerate} % End all the questions

% \vspace{130mm}
% How about distribution of y-hat?

\end{document}

% In Chapter 2, 2.3 part 2.4 rank 2.5 2.6 pos def 2.7 systems of eq, 2.8 generalized inverse skip, 2.9 det, 2.10, orthog mat, 2.11 trace, 2.12 eigen, 2.13 idempotent skip, 2.14 matrix derivatives skip.
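A small addition, not from the original source file: a sketch in R of the one-sample t calculations for the confidence interval and testing question, using the summary statistics given there (n = 23, ybar = 2.57, s^2 = 5.85). It assumes the standard one-sample t formulas from the formula sheet and is meant as a sketch, not the official solution.

# Question 3 sketch (added): critical value, confidence interval, test statistic, p-value
n = 23; ybar = 2.57; s2 = 5.85; mu0 = 3; alpha = 0.05
se = sqrt(s2/n)                      # Estimated standard error of ybar
tcrit = qt(1 - alpha/2, df = n-1)    # Critical value t_{0.975}
c(ybar - tcrit*se, ybar + tcrit*se)  # 95% confidence interval for mu
tstat = (ybar - mu0)/se              # T statistic for H0: mu = 3
pval = 2*pt(-abs(tstat), df = n-1)   # Two-sided p-value
c(tstat, pval)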
\hrule
\vspace{3mm}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

# 1 (2.6a)
A = rbind(c(8,3,7), c(-2,5,-3))
B = rbind(c(-2,5), c(3,7), c(6,-4))
A %*% B
B %*% A

# 7e
x = c( 0.0, 1.3, 3.2, -2.5, -4.6, -1.6, 4.5, 3.8)
y = c(-0.8, -1.3, 7.4, -5.2, -6.5, -4.9, 9.9, 7.2)
ybar = mean(y); xbar = mean(x); ss = sum((x-xbar)^2)
beta1hat = sum((x-xbar)*(y-ybar))/ss; beta0hat = ybar - beta1hat*xbar
c(beta0hat,beta1hat)
lm(y~x) # Check
# Note it's curvy -- plot

--------------------------------
>
> x = c( 0.0, 1.3, 3.2, -2.5, -4.6, -1.6, 4.5, 3.8)
> y = c(-0.8, -1.3, 7.4, -5.2, -6.5, -4.9, 9.9, 7.2)
> ybar = mean(y); xbar = mean(x); ss = sum((x-xbar)^2)
> beta1hat = sum((x-xbar)*(y-ybar))/ss; beta0hat = ybar - beta1hat*xbar
> c(beta0hat,beta1hat)
[1] -0.2497055  1.9018644
> lm(y~x) # Check

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
    -0.2497       1.9019

>
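Another addition, not from the original file: a quick numerical check of the least squares recipe in the regression question. Q(beta) is minimized directly with optim() (Nelder-Mead by default), and the result should agree with beta0hat and beta1hat above; the starting values c(0,0) are arbitrary.

# Added check: minimize Q(beta) numerically and compare with the closed-form estimates
x = c( 0.0, 1.3, 3.2, -2.5, -4.6, -1.6, 4.5, 3.8)
y = c(-0.8, -1.3, 7.4, -5.2, -6.5, -4.9, 9.9, 7.2)
Q = function(beta) sum( (y - beta[1] - beta[2]*x)^2 )  # Sum of squared errors Q(beta0, beta1)
optim(c(0,0), Q)$par                                   # Should be close to c(beta0hat, beta1hat)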
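One more addition, also not from the original file: a small simulation check of the claim that (n-1)s^2/sigma^2 has a chi-squared(n-1) distribution when sampling from a normal population. The sample size, parameter values, seed and number of replications are arbitrary choices.

# Added check: simulate (n-1)s^2/sigma^2 and compare its sample mean and variance
# with the chi-squared(n-1) values, n-1 and 2(n-1)
set.seed(302)                                   # Arbitrary seed
n = 23; mu = 2.57; sigma = sqrt(5.85); nsim = 10000
w = replicate(nsim, (n-1)*var(rnorm(n, mu, sigma))/sigma^2)
c(mean(w), var(w))                              # Should be near n-1 and 2*(n-1)
c(n-1, 2*(n-1))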