% Sample Question document for STA312 \documentclass[12pt]{article} %\usepackage{amsbsy} % for \boldsymbol and \pmb %\usepackage{graphicx} % To include pdf files! \usepackage{amsmath} \usepackage{amsbsy} \usepackage{amsfonts} \usepackage[colorlinks=true, pdfstartview=FitV, linkcolor=blue, citecolor=blue, urlcolor=blue]{hyperref} % For links \usepackage{fullpage} \usepackage{comment} %\pagestyle{empty} % No page numbers \begin{document} %\enlargethispage*{1000 pt} \begin{center} {\Large \textbf{Sample Questions: Maximum Likelihood Part 2}}%\footnote{} \vspace{1 mm} STA312 Fall 2023. Copyright information is at the end of the last page. \end{center} \vspace{5mm} \begin{enumerate} % Begin the questions \item Let $X_1, \ldots, X_n$ be independent $N(\mu,\sigma^2)$ random variables. \begin{enumerate} \item Derive formulas for the maximum likelihood estimates of $\mu$ and $\sigma^2$. We will establish that it's a maximum later. Show your work and \textbf{circle your final answer}. \pagebreak %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \item Calculate the Hessian of the minus log likelihood function: $\mathbf{H} = \left[\frac{\partial^2 (-\ell)} {\partial\theta_i\partial\theta_j}\right]$. Show your work. \pagebreak %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \item Give $\widehat{\mathbf{V}}_n$, the estimated asymptotic variance-covariance matrix of the MLE. Show some work. \pagebreak %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \item Consider a large-sample $Z$-test of $H_0:\mu=\mu_0$. Give an explicit formula for the test statistic. This is something you would be able to compute with a calculator given $\widehat{\mu}$ and $\widehat{\sigma}^2$. \vspace{100mm} \item Consider a large-sample $Z$-test of $H_0:\sigma^2=\sigma^2_0$. Give an explicit formula for the test statistic. This is something you would be able to compute with a calculator. 
\pagebreak %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\item Consider the large-sample likelihood ratio test of $H_0: \mu=\mu_0$. Derive an explicit formula for the test statistic $G^2$. Show your work and \emph{keep simplifying!}
\end{enumerate} % End of the first set of normal questions

\pagebreak %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\item The file \href{http://www.utstat.toronto.edu/brunner/data/legal/normal.data.txt}
{\texttt{http://www.utstat.toronto.edu/brunner/data/legal/normal.data.txt}}
has a random sample from a normal distribution.
\begin{enumerate}
\item Find the maximum likelihood estimates $\widehat{\mu}$ and $\widehat{\sigma}^2$ numerically. Compare the answer to your closed-form solution.
\item Show that the minus log likelihood is indeed minimized at $(\widehat{\mu}, \widehat{\sigma}^2)$ for this data set.
\item Calculate the estimated asymptotic covariance matrix of the MLEs.
\item Give a ``better'' estimated asymptotic covariance matrix based on your closed-form solution.
\item Calculate a large-sample 95\% confidence interval for $\sigma^2$.
\item Test $H_0: \mu = 103$ with a
\begin{enumerate}
\item $Z$-test.
\item Likelihood ratio chi-squared test. Compare to the closed-form version.
\item Wald chi-squared test.
\end{enumerate}
Give the test statistic and the $p$-value for each test.
\item The coefficient of variation (used in sample surveys and business statistics) is the standard deviation divided by the mean.
\begin{enumerate}
\item Show that multiplication by a positive constant does not affect the coefficient of variation. This is a paper and pencil calculation.
\item Give a numerical point estimate of the coefficient of variation for the normal data of this question. Actually, it's the maximum likelihood estimate, because \emph{the invariance principle of maximum likelihood estimation says that the MLE of a function is that function of the MLE}.
\item Using the delta method, give a 95\% confidence interval for the coefficient of variation. Start with a paper and pencil calculation of $\dot{g}(\boldsymbol{\theta}) = \left( \frac{\partial g}{\partial\theta_1}, \ldots , \frac{\partial g}{\partial\theta_k} \right)$.
\end{enumerate}
\end{enumerate}

% I had lifted the dead pixels problem: Q7 of 2101f18 A5. But it took us too far afield. Stay with normal.
\begin{comment}
\pagebreak
\item Dead pixels are a big problem in manufacturing computer and cell phone screens. The physics of the manufacturing process dictates that dead pixels happen according to a spatial Poisson process, so that the numbers of dead pixels in cell phone screens are independent Poisson random variables with parameter $\lambda$, the expected number of dead pixels. Naturally, $\lambda$ depends on details of how the screens are manufactured. In an effort to reduce the expected number of dead pixels, six assembly lines were set up, each with a different version of the manufacturing process. A random sample of 50 phones was taken from each assembly line and sent to the lab for testing. Mysteriously, three phones from one assembly line disappeared in transit, and 15 phones from another assembly line disappeared. Sample sizes and sample mean numbers of dead pixels appear in the table below.
\begin{verbatim}
               Manufacturing Process
             1        2        3        4        5        6
     ------------------------------------------------------
ybar     10.68  9.87234     9.56     8.52 10.48571     9.98
n           50       47       50       50       35       50
     ------------------------------------------------------
\end{verbatim}
The first task is to carry out a large-sample likelihood ratio test to see whether the expected numbers of dead pixels are different for the six manufacturing processes. Using R, calculate the test statistic and the $p$-value. Also report the degrees of freedom. You are being asked for a computation, but \emph{most of the task is thinking and working things out on paper}.
I got away with only five lines of code: One line to enter the means, one line to enter the sample sizes, one line to compute $G^2$, one line to compute the $p$-value, and one other line. Here are some little questions to get you started. \begin{enumerate} % \item Is this a between-cases design or a within-cases design? \item Denote the parameter vector by $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_p)^\top$. What is $p$? \item What is the null hypothesis? \item What is the distribution of a sum of independent Poisson random variables? \item What is the distribution of $n_j\overline{Y}_j$? \item What is the likelihood function? Write it down and simplify. \item What is the unrestricted MLE $\widehat{\boldsymbol{\lambda}}$? It's a vector. Work it out if you need to. \item What is the restricted MLE $\widehat{\boldsymbol{\lambda}}_0$? It's a vector. Work it out if you need to. \item Now you are ready to write the test statistic. There are a lot of cancellations. Keep simplifying! \item Now use R to compute the test statistic and $p$-value. % For comparison, my $p$-value is $0.01169133$. \end{enumerate} \end{comment} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Bayes \end{enumerate} % End of all the questions \vspace{20mm} \noindent \begin{center}\begin{tabular}{l} \hspace{6in} \\ \hline \end{tabular}\end{center} This document was prepared by \href{http://www.utstat.toronto.edu/~brunner}{Jerry Brunner}, Department of Mathematical and Computational Sciences, University of Toronto. It is licensed under a \href{http://creativecommons.org/licenses/by-sa/3.0/deed.en_US} {Creative Commons Attribution - ShareAlike 3.0 Unported License}. Use any part of it as you like and share the result freely. 
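The numerical work in Question 2 (finding the MLEs by minimizing the minus log likelihood and estimating the asymptotic covariance matrix) can be sketched as follows. This is a minimal illustration, in Python rather than the course's R, and it uses simulated normal data in place of \texttt{normal.data.txt}, whose contents are not reproduced here; the mean 100, standard deviation 15, and sample size 50 are illustrative assumptions, not properties of the real data file.

```python
# Numerical MLE for a N(mu, sigma^2) sample, as outlined in Question 2.
# NOTE: simulated data stand in for normal.data.txt; the true mean and
# variance used below are illustrative assumptions only.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(312)
x = rng.normal(loc=100.0, scale=15.0, size=50)
n = len(x)

def mll(theta):
    """Minus log likelihood of an i.i.d. N(mu, sigma2) sample."""
    mu, sigma2 = theta
    return 0.5 * n * np.log(2.0 * np.pi * sigma2) \
        + np.sum((x - mu) ** 2) / (2.0 * sigma2)

# Start near, but not exactly at, the closed-form answer and minimize.
fit = minimize(mll, x0=[np.median(x), 1.2 * np.var(x)], method="BFGS")
mu_hat, sigma2_hat = fit.x

# The closed-form MLEs are the sample mean and the n-denominator sample
# variance; the numerical answers should agree with them closely.
print(mu_hat, x.mean())
print(sigma2_hat, np.var(x))

# Estimated asymptotic covariance matrix of the MLE: the inverse Hessian
# of the minus log likelihood at the minimum (BFGS maintains a running
# approximation of this inverse).
V_hat = fit.hess_inv
```

With the real data, `x` would instead be read from the file (for example with `np.loadtxt`), and the approximate inverse Hessian from BFGS could be replaced by the exact closed-form Hessian of Question 1(b), inverted and evaluated at the MLE, to get the ``better'' $\widehat{\mathbf{V}}_n$ of Question 2(d).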
The \LaTeX~source code is available from the course website:
\begin{center}
\href{http://www.utstat.toronto.edu/brunner/oldclass/312f23}
{\small\texttt{http://www.utstat.toronto.edu/brunner/oldclass/312f23}}
\end{center}

\end{document}