%                                   302f20Assignment6.tex
\documentclass[12pt]{article} 
%\usepackage{amsbsy} % for \boldsymbol and \pmb 
\usepackage{graphicx} % To include pdf files!
\usepackage{amsmath}
\usepackage{amsbsy}
\usepackage{amsfonts}
\usepackage{comment}
\usepackage{euscript} % for \EuScript
\usepackage[colorlinks=true, pdfstartview=FitV, linkcolor=blue, citecolor=blue, urlcolor=blue]{hyperref} % For links
\usepackage{fullpage}
%\pagestyle{empty} % No page numbers


\begin{document}
%\enlargethispage*{1000 pt} 

\begin{center}   
{\Large \textbf{STA 302f20 Assignment Six}\footnote{This assignment was prepared by  \href{http://www.utstat.toronto.edu/~brunner}{Jerry Brunner},
Department of Statistical Sciences, University of Toronto. It is licensed under a 
\href{http://creativecommons.org/licenses/by-sa/3.0/deed.en_US}
     {Creative Commons Attribution - ShareAlike 3.0 Unported License}. Use any part of it as you like and share the result freely. The \LaTeX~source code is available from the course website:
\href{http://www.utstat.toronto.edu/~brunner/oldclass/302f20} {\small\texttt{http://www.utstat.toronto.edu/$^\sim$brunner/oldclass/302f20}}} }
\vspace{1 mm}
\end{center}

\noindent
The following problems are not to be handed in. They are preparation for the Quiz in tutorial and the final exam. Please try them before looking at the answers. Use the formula sheet.  % Please remember that the R parts (Questions~\ref{sat} and~\ref{faraway} are \emph{not group projects}. You may compare numerical answers, but do not show anyone your code or look at anyone else's.

\begin{enumerate} 

\item True or False: The sum of residuals is always equal to zero. Either prove that the statement is true, or use R to produce an example showing it is not true in general.

\item True or False: The sum of \emph{expected} residuals is always equal to zero. Either prove that the statement is true, or use R to produce an example showing it is not true in general.

\item True or False: The sum of residuals is always equal to zero if the model has an intercept. Either prove that the statement is true, or use R to produce an example showing it is not true in general.

\item Sometimes one can learn by just playing around. Suppose we fit a regression model, obtaining $\widehat{\boldsymbol{\beta}}$, $\widehat{\mathbf{y}}$, $\widehat{\boldsymbol{\epsilon}}$ and so on. Then we fit another regression model with the same predictor variables, but this time using  $\widehat{\mathbf{y}}$ as the predicted variable instead of $\mathbf{y}$.
    \begin{enumerate}
        \item Denote the vector of estimated regression coefficients from the new model by $\widehat{\widehat{\boldsymbol{\beta}}}$. Calculate $\widehat{\widehat{\boldsymbol{\beta}}}$ and simplify. Should you be surprised at this answer?
        \item Calculate $\widehat{\widehat{\mathbf{y}}}$. Why is this not surprising if you think in terms of projections? 
    \end{enumerate}

\item Now do the same thing as in the preceding question, but with $\widehat{\boldsymbol{\epsilon}}$ as the predicted variable. Can you understand this in terms of projections?

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% MVN via MGF %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% \pagebreak

\item The joint moment-generating function of a $p$-dimensional random vector $\mathbf{x}$ is defined as $M_{\mathbf{x}}(\mathbf{t}) = E\left(e^{\mathbf{t}^\prime \mathbf{x}} \right)$. 
    \begin{enumerate}
        \item Let $\mathbf{y} = \mathbf{Ax}$, where $\mathbf{A}$ is a matrix of constants. Find the moment-generating function of $\mathbf{y}$.
        \item Let $\mathbf{y} = \mathbf{x} + \mathbf{c}$, where $\mathbf{c}$ is a $p \times 1$ vector of constants. Find the moment-generating function of $\mathbf{y}$.
    \end{enumerate}

\item Let the random vector 
$    \mathbf{x} = \left(\begin{array}{c} \mathbf{x}_1 \\ \hline \mathbf{x}_2 
                            \end{array}\right)$ and the vector of constants
$    \mathbf{t} = \left(\begin{array}{c} \mathbf{t}_1 \\ \hline \mathbf{t}_2 
                            \end{array}\right)$, with
$\mathbf{t}_1$ the same length as $\mathbf{x}_1$, and $\mathbf{t}_2$ the same length as $\mathbf{x}_2$. Let $\mathbf{x}_1$ and $\mathbf{x}_2$ be independent, and for convenience also assume they have joint densities. Show that the joint moment-generating function 
$M_{\mathbf{x}}(\mathbf{t}) = M_{\mathbf{x}_1}(\mathbf{t}_1) M_{\mathbf{x}_2}(\mathbf{t}_2)$.
 
\item Let $w_1$ and $w_2$ be independent \emph{scalar} random variables. Using moment-generating functions, show that $y_1 = g_1(w_1)$ and $y_2 = g_2(w_2)$ are independent. For convenience, you may assume that all the random variables have densities. 

% Degenerate RV?

\item Let $y$ be a \emph{degenerate} random variable, with $P(y=\mu)=1$. 
    \begin{enumerate}
        \item Find the moment-generating function of $y$.
        \item In what sense is $y$ normally distributed?
    \end{enumerate}

\item Let $\mathbf{x} = \left(\begin{array}{c} x_1 \\  x_2 
                            \end{array}\right) 
                        \sim N_2(\boldsymbol{\mu},\boldsymbol{\Sigma})$, with
$\boldsymbol{\mu} = \left(\begin{array}{c} 1 \\ 5 
                            \end{array}\right)$ and 
$\boldsymbol{\Sigma} = \left(\begin{array}{cc}
                    1 & 2 \\ 
                    2 & 4   
                    \end{array}\right)$. 
Give non-zero constants $a_1$ and $a_2$ such that $y = a_1x_1+a_2x_2$ has a degenerate distribution. Use moment-generating functions to show that the distribution is degenerate. This makes the joint distribution of $\mathbf{x}$ a \emph{singular} multivariate normal.

\item Let $z_1, \ldots, z_p \stackrel{i.i.d.}{\sim}N(0,1)$, and 
    \begin{displaymath}
    \mathbf{z} = \left(
                 \begin{array}{c}
                 z_1 \\ \vdots \\ z_p
                 \end{array}
                 \right).
    \end{displaymath} 
    \begin{enumerate}
        \item What is $E(\mathbf{z})$?
        \item What is $cov(\mathbf{z})$?
        \item What is the joint moment-generating function of $\mathbf{z}$? Show your work. 
        \item Let $\mathbf{y} = \boldsymbol{\Sigma}^{1/2}\mathbf{z} + \boldsymbol{\mu}$, where $\boldsymbol{\Sigma}$ is a $p \times p$ symmetric \emph{non-negative definite} matrix  and $\boldsymbol{\mu} \in \mathbb{R}^p$. 
            \begin{enumerate}
                \item What is $E(\mathbf{y})$?
                \item What is the variance-covariance matrix of $\mathbf{y}$? Show some work.
                \item What is the moment-generating function of $\mathbf{y}$? Show your work.
            \end{enumerate}
    \end{enumerate}

\item Let $\mathbf{y} \sim N_2(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, with
\begin{displaymath}
    \mathbf{y} = \left(\begin{array}{c} y_1 \\  y_2 
                            \end{array}\right) ~~~~~
    \boldsymbol{\mu} = \left(\begin{array}{c} \mu_1 \\ \mu_2 
                            \end{array}\right) ~~~~~
    \boldsymbol{\Sigma} = \left(\begin{array}{cc}
        \sigma^2_1 & 0 \\ 
        0 & \sigma^2_2   
    \end{array}\right) 
\end{displaymath}
Using moment-generating functions, show $y_1$ and $y_2$ are independent. This is very similar to how the calculation goes for the full multivariate case.

\item Let $x_1 \sim N(1,1)$, $x_2 \sim N(0,2)$ and $x_3 \sim N(6,1)$ be independent random variables, and let $y_1=x_1+x_2$ and $y_2=x_2+x_3$. Find the joint distribution of $y_1$ and $y_2$.

\item Let $x_1$ be Normal$(\mu_1, \sigma^2_1)$, and $x_2$ be Normal$(\mu_2, \sigma^2_2)$, independent of $x_1$. What is the joint distribution of $y_1=x_1+x_2$ and $y_2=x_1-x_2$? What is required for $y_1$ and $y_2$ to be independent? % Hint: Use matrices.  

\item \label{linear_trans} Let $\mathbf{y} \sim N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ and $\mathbf{w}=\mathbf{Ay+c}$, where $\mathbf{A}$ is an $r \times p$ matrix of constants and $\mathbf{c}$ is an $r \times 1$ vector of constants. What is the distribution of $\mathbf{w}$? Prove your answer using moment-generating functions. 

You have shown that any linear transformation of a multivariate normal is multivariate normal. This is useful because from now on, if you observe that some random vector is a linear transformation of a multivariate normal, you don't need moment-generating functions to find its distribution. Just calculate the expected value and covariance matrix. 

% \pagebreak %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\item Here are some distribution facts that you will need to know without looking at a formula sheet in order to follow the proofs. You are responsible for the proofs of these facts too, but here you are just supposed to write down the answers.
    \begin{enumerate}
        \item Let $x\sim N(\mu,\sigma^2)$ and $y=ax+b$, where $a$ and $b$ are constants. What is the distribution of $y$? 
        \item Let $x\sim N(\mu,\sigma^2)$ and $z = \frac{x-\mu}{\sigma}$. What is the distribution of $z$? 
        \item Let $x_1, \ldots, x_n$ be a random sample from a $N(\mu,\sigma^2)$ distribution. What is the distribution of $y = \sum_{i=1}^nx_i$?
        \item Let $x_1, \ldots, x_n$ be a random sample from a $N(\mu,\sigma^2)$ distribution. What is the distribution of the sample mean $\overline{x}$?
        \item Let $x_1, \ldots, x_n$ be a random sample from a $N(\mu,\sigma^2)$ distribution. What is the distribution of $z = \frac{\sqrt{n}(\overline{x}-\mu)}{\sigma}$?
        \item Let $x_1, \ldots, x_n$ be independent random variables, with $x_i \sim N(\mu_i,\sigma_i^2)$. Let $a_0, \ldots, a_n$ be constants. What is the distribution of $y = a_0 + \sum_{i=1}^n a_ix_i$?
        \item Let $x_1, \ldots, x_n$ be independent random variables with $x_i \sim \chi^2(\nu_i)$ for $i=1, \ldots, n$. What is the distribution of $y = \sum_{i=1}^n x_i$?
        \item Let $z \sim N(0,1)$. What is the distribution of $y=z^2$?
        \item Let $x_1, \ldots, x_n$ be random sample from a $N(\mu,\sigma^2)$ distribution. What is the distribution of $y = \frac{1}{\sigma^2} \sum_{i=1}^n\left(x_i-\mu \right)^2$?
        \item Let $y=w_1+w_2$, where $w_1$ and $w_2$ are independent, $w_1\sim\chi^2(\nu_1)$ and $y\sim\chi^2(\nu_1+\nu_2)$. The parameters $\nu_1$ and $\nu_2$ are both positive. What is the distribution of $w_2$? 
    \end{enumerate}


\item Let $\mathbf{y} \sim N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, and $\mathbf{v} =  \boldsymbol{\Sigma}^{-\frac{1}{2}} (\mathbf{y}-\boldsymbol{\mu})$.
    \begin{enumerate}
        \item As an application of Problem~\ref{linear_trans}, what is the distribution of $\mathbf{v}$?
        \item Show $w = (\mathbf{y}-\boldsymbol{\mu})^\prime
                  \boldsymbol{\Sigma}^{-1}(\mathbf{y}-\boldsymbol{\mu}) \sim \chi^2 (p)$. This may be a bit easier than the way it was done in lecture.
    \end{enumerate}


\pagebreak %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


\item Let $x_1, \ldots, x_n \stackrel{i.i.d}{\sim} N(\mu,\sigma^2)$, and
$\mathbf{x} = \left( \begin{array}{c}
               x_1 \\ \vdots \\ x_n
              \end{array}\right)$. 
    \begin{enumerate}
        \item Let $\mathbf{y} =  
             \left( \begin{array}{c}
               x_1-\overline{x} \\ \vdots \\  x_n-\overline{x} \\\\ \overline{x}
             \end{array} \right)$. How do you know that $\mathbf{y}$ is multivariate normal, without doing any calculations?
        \item \label{covzero} Show $Cov\left(\overline{x},x_j-\overline{x}\right)=0$ This is a scalar calculation.
        \item Here is a matrix version of Question~\ref{covzero}. Following the text, let $\mathbf{j}$ denote a column vector of ones; in this case, $\mathbf{j}$ is $n \times 1$, and we can write $\overline{x} = \frac{1}{n}\mathbf{j}^\prime\mathbf{x}$. The task is to show 
$cov\left( \overline{x} \, , \, \mathbf{x} - \mathbf{j} \overline{x} \right) = \mathbf{0}$. A hint is to use Question 16 of Assignment 3.
        \item Why does the preceding result (or Question~\ref{covzero}) show that $\overline{x}$ is independent of 
        $\mathbf{y}_2 = \left( \begin{array}{c}
               x_1-\overline{x} \\ \vdots \\  x_n-\overline{x} 
             \end{array} \right)$?
        \item Why does the independence of $\overline{x}$ and $\mathbf{y}_2$ imply the independence of $\overline{x}$ and $s^2$? 
        \item  Show that $ \frac{(n-1)s^2}{\sigma^2} \sim \chi^2(n-1)$. Hint: $\sum_{i=1}^n\left(x_i-\mu \right)^2 = \sum_{i=1}^n\left(x_i-\overline{x} + \overline{x} - \mu \right)^2 = \ldots$. Where do you use the independence of $\overline{x}$ and $s^2$?
        \item Recall the definition of the $t$ distribution. If $z \sim N(0,1)$, $w \sim \chi^2(\nu)$ and $z$ and $w$ are independent, then $t = \frac{z}{\sqrt{w/\nu}}$ is said to have a $t$ distribution with $\nu$ degrees of freedom, and we write $t  \sim t(\nu)$. For random sampling from a normal distribution, show that $t = \frac{\sqrt{n}(\overline{x}-\mu)}{s} \sim t(n-1)$. Where do you use the independence of $\overline{x}$ and $s^2$?
    \end{enumerate} % End of UVN question

\end{enumerate} % End of all the questions

%\vspace{60mm}

\end{document}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


% \vspace{30mm} \hrule \vspace{30mm} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% \vspace{30mm}\hrule\vspace{30mm}

% Removed

    \item \label{computer} The \texttt{statclass} data consist of Quiz
  average, Computer assignment average, Midterm score and Final Exam score from
  a statistics class, long ago. At the R prompt, type 

{\scriptsize
\begin{verbatim}
statclass = read.table("http://www.utstat.utoronto.ca/~brunner/data/legal/LittleStatclassdata.txt")
\end{verbatim}
} % End size
You now have access to the \texttt{statclass} data, just as you have access to the \texttt{trees} data set used in lecture, or any other R data set.
    \begin{enumerate}
        \item Calculate $\widehat{\boldsymbol{\beta}}$ two ways, with matrix commands and with the \texttt{lm} function. What is $\widehat{\beta}_2$? The answer is a number on your printout.
        \item What is the predicted Final Exam score for a student with a Quiz average of 8.5, a Computer average of 5, and a Midterm mark of 60\%? The answer is a number. Be able to do this kind of thing on the quiz with a calculator. My answer is 63.84144. 
        \item For any fixed Quiz Average and Computer Average, a score one point higher on the Midterm yields a predicted mark on the Final Exam that is \underline{\hspace{10mm}} higher.
        \item For any fixed Quiz Average and Midterm score, an average one point higher on the Midterm yields a predicted mark on the Final Exam that is \underline{\hspace{10mm}} higher. Or is it lower? 
    \end{enumerate}