% \documentclass[serif]{beamer} % Serif for Computer Modern math font. \documentclass[serif, handout]{beamer} % Handout mode to ignore pause statements \hypersetup{colorlinks,linkcolor=,urlcolor=red} \usefonttheme{serif} % Looks like Computer Modern for non-math text -- nice! \setbeamertemplate{navigation symbols}{} % Supress navigation symbols \usetheme{AnnArbor} % CambridgeUS Blue and yellow, Shows current section title % \usetheme{Berlin} % Displays sections on top \usepackage[english]{babel} % \definecolor{links}{HTML}{2A1B81} % \definecolor{links}{red} \setbeamertemplate{footline}[frame number] \mode % \mode{\setbeamercolor{background canvas}{bg=black!5}} \title{Extensions of the Proportional Hazards Model\footnote{See last slide for copyright information.}} \subtitle{STA312 Spring 2019} \date{} % To suppress date \begin{document} \begin{frame} \titlepage \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame} \frametitle{Background Reading} %\framesubtitle{} \begin{itemize} \item Section 8.2 in Chapter 8, and Chapter 9 in \emph{Applied Survival Analysis Using R} by Dirk Moore \item[] \item \emph{Modeling Survival Data: Extending the Cox Model} (2000) by Terry Thereau and Patricia Grambsch \end{itemize} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame} \frametitle{Overview} \tableofcontents \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Stratification} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame} \frametitle{Stratification} \pause %\framesubtitle{} \begin{itemize} \item \emph{Strata} are levels, or layers, like a cake. \pause \item Think of a stratum as a sub-population. \pause \item We often consider an independent random sample from each stratum. \pause \item For example, companies in Canada, the U.S.~and Mexico. \pause \item For proportional hazards regression, it may not make sense to assume that the baseline hazard functions are the same in all the strata. \pause \item Multi-center clinical trials, with different patient populations in each medical center. \pause \item Assume a separate baseline hazard function in each stratum. \end{itemize} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame} \frametitle{Partial Likelihood Function for a Stratified Model} \framesubtitle{There are $k$ strata} \pause {\LARGE \begin{displaymath} \mbox{PL}(\boldsymbol{\beta}) = \prod_{\ell=1}^k \left( \prod_{i=1}^D \frac{\displaystyle e^{\mathbf{x}_{i,\ell}^\top \boldsymbol{\beta}} } {\displaystyle \sum_{j \in R_{i,\ell}} e^{\mathbf{x}_{j,\ell}^\top \boldsymbol{\beta}}} \right) \end{displaymath} \pause } % End size \begin{itemize} \item Separate baseline hazards are cancelling within the parentheses. \pause \item Note that the parameter vector $ \boldsymbol{\beta}$ is the same in all strata. \pause \item This condition can be relaxed. \pause \item And tested with a partial likelihood ratio test. \pause \item But there is no direct test for differences between strata. \end{itemize} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame}[fragile] \frametitle{Sample Code for Stratification} %\framesubtitle{} \begin{verbatim} coxph(Surv(time, delta) ~ age + strata(ascites) + bili + protime + albumin) \end{verbatim} \end{frame} % bili is serum bilirunbin, a blood test. albumin is another blood test. protime is standardized blood clotting time. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Time Dependent Coefficients} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame} \frametitle{Time Dependent Coefficients} \pause %\framesubtitle{} \begin{itemize} \item The regression coefficients $\beta_j$ might depend on time\pause: $\beta_j(t)$. \pause \begin{displaymath} h(t) = h_0(t) \exp\{ \mathbf{x}^\top\boldsymbol{\beta}(t)\} \end{displaymath} \pause \item This is attractive, but maximum likelihood estimation of the function \pause (actually, $p$ functions) \pause would require lots of failures at every possible time point. \pause \item Solution: Estimate the function another way, and then put the estimate into the partial likelihood. \end{itemize} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame} \frametitle{Schoenfeld Residuals} \pause %\framesubtitle{} For each $0 < t_1, < \ldots < t_k < \ldots < t_D$ \pause \begin{itemize} \item $\mathbf{s}_k$ is the vector of $p$ Schoenfeld residuals at time $k$. \pause \item $s_{k,j}$ is the Schoenfeld residual for variable $j$ at time $k$. \pause \item $\mathbf{s}^*_k$ are the scaled Schoenfeld residuals. \pause \item Grambsch and Thereau have shown $E(s^*_{k,j} + \widehat{\beta}_j) \approx \beta_j(t_k)$. \pause \item They suggest using $s^*_{k,j} + \widehat{\beta}_j$ to estimate $\beta_j(t)$ at $t_k$.\pause \item If a plot of the Schoenfeld residuals against time looks constant, no problem. \pause \item If the plot shows a trend, it suggests $\beta_j$ is a function of time. \pause \item And the proportional hazards assumption is wrong. \end{itemize} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame} \frametitle{Testing proportional hazards using Schoenfeld residuals} \pause %\framesubtitle{} \begin{itemize} \item Have a plot of the Schoenfeld residuals against time. \pause \item Test whether the correlation equals zero. \pause \item Transformations of the $t$ axis (scaling) allow curves. \pause \item For example, check correlation of the residuals against $\log(t)$. \end{itemize} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame} \frametitle{Estimation for a fixed $\beta_j(t)$} \framesubtitle{Using partial likelihood} \pause \begin{itemize} \item $\beta_j(t)$ is assumed ``known," but usually it's a guess based on residual plots. \pause \item For simplicity, consider a single explanatory variable. \pause \item Original model: $h(t_i) = h_0(t_i) \exp\{ \beta x_i \}$. \pause \item Create a time-varying covariate that just equals time, or a function of time $g(t)$ like $\log(t)$. \pause \item Replace $x_i$ by the ``interaction term" $x_i \, g(t_i)$. \pause \item Model for the hazard is now $h(t_i) = h_0(t_i) \exp\{ \beta g(t_i) \, x_i \}$. \pause \item The function $\beta(t) \pause = \beta g(t_i)$. \pause \item The $\beta$ part is unknown, and is estimated as usual by maximum partial likelihood. \pause \item So really you are assuming that the form of $\beta(t)$ is known, but only up to multiplication by a constant. \end{itemize} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame}[fragile] \frametitle{Sample Code for Time-Dependent Coefficients} %\framesubtitle{} \begin{verbatim} coxph( Surv(time,status) ~ celltype + tt(karno), tt = function(x,t, ...){x*log(t)} ) \end{verbatim} \pause \begin{verbatim} loginter = function(x,t,...) {x*log(t)} coxph(Surv(time,status) ~ celltype + tt(karno), tt = loginter) \end{verbatim} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Frailty Models} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame} \frametitle{Frailty Models} \framesubtitle{Within-cases, Random effects} \pause \begin{itemize} \item A single unit may contribute more than one event\pause, like several seizures. \pause \item Randomly assign one eye to experimental condition, one to control. \pause Response variable is time to blindness. \pause \item Some groups of patients are surely not independent\pause, like several female relatives of a breast cancer patient. \pause \item The reason for the term ``frailty" \pause is the idea that individuals (and units) have a characteristic that is their own relative chance of failure. \pause \item Frail means weak -- more likely to die. \end{itemize} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame} \frametitle{The Frailty Model} \framesubtitle{Random effects} \pause The hazard at time $j$ for cluster $i$ \pause is $h_{i,j}(t_{i,j}) = h_0(t_{i,j}) \, \omega_i \, \exp\{ \mathbf{x}_{i,j}^\top\boldsymbol{\beta} \}$. \pause \begin{itemize} \item $\omega_i>0$ is a \emph{random effect}. \pause \item The clusters (individuals, families, whatever) are randomly sampled from some population\pause, and the hazard is multiplied by the same quantity $\omega_i$ for every member of cluster $i$. \pause \item If $\omega_i=2$, it means every member of cluster $i$ is quite frail. \pause Their hazards are all multiplied by 2. \pause \item Think of it as a ``random shock." \pause \item Shock is random because clusters are assumed to be randomly sampled from some population. \pause \item So $\omega_i>0$ comes from some (assumed) probability distribution. \pause \item Gamma and log-normal are typical choices. \pause \item For log-normal(0,$\sigma^2$)\pause, the parameter vector is $(\boldsymbol{\beta}, \sigma^2)$. \end{itemize} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame} \frametitle{Log-Normal Random Effects} \framesubtitle{Instead of writing $h_{i,j}(t_{i,j}) = h_0(t_{i,j}) \, \omega_i \, \exp\{ \mathbf{x}_{i,j}^\top\boldsymbol{\beta} \}$} \pause Another way to write the hazard is \pause {\LARGE \begin{displaymath} h_{i,j}(t_{i,j}) = h_0(t_{i,j}) \exp\{ \sigma z_i + \mathbf{x}_{i,j}^\top\boldsymbol{\beta} \}, \end{displaymath} \pause } % End size where $z_i$ is standard normal. \pause \begin{itemize} \item $\sigma$ is like another regression coefficient. \pause \item Interpretation: \pause If the random effect is one standard deviation above the mean (so $z_i=1$)\pause, then the hazard is multiplied by $e^\sigma$. \end{itemize} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame}[fragile] \frametitle{Sample Code for Frailty Models} \framesubtitle{\texttt{me} stands for mixed effects} \pause \begin{verbatim} install.packages("coxme",dependencies=TRUE) # Only need to do this once library(coxme) \end{verbatim} \pause \begin{verbatim} (Surv(age, brcancer) ~ mutant + (1|famID), data=ashkenazi) \end{verbatim} \pause \begin{verbatim} coxph(Surv(y, uncens) ~ trt) # Just treatment \end{verbatim} \pause \begin{verbatim} # Add random effect for medical center coxme(Surv(y, uncens) ~ trt + (1|center)) \end{verbatim} \pause \begin{verbatim} # Random effect of treatment nested within medical center coxme(Surv(y, uncens) ~ trt + (1 | center/trt)) \end{verbatim} \pause Rich specification of mixed models as in \texttt{lmer}. \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Competing Risks} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame} \frametitle{Copyright Information} This slide show was prepared by \href{http://www.utstat.toronto.edu/~brunner}{Jerry Brunner}, Department of Statistics, University of Toronto. It is licensed under a \href{http://creativecommons.org/licenses/by-sa/3.0/deed.en_US} {Creative Commons Attribution - ShareAlike 3.0 Unported License}. Use any part of it as you like and share the result freely. The \LaTeX~source code is available from the course website: \href{http://www.utstat.toronto.edu/~brunner/oldclass/312s19} {\footnotesize \texttt{http://www.utstat.toronto.edu/$^\sim$brunner/oldclass/312s19}} \end{frame} \end{document} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % $i = 1, \ldots, D$, the uncensored observations # veteran -- see A10, and also TG p. 135 rm(list=ls()); options(scipen=999) library(survival) help(veteran) summary(veteran) # Edit data frame veteran = within(veteran, { trt = trt - 1 # Makes it indicator for test treatment; standard is reference. # prior = prior/10 }) # Experimental treatment, cell type and Karnofsky score mod0 = coxph(Surv(time,status) ~ celltype + karno, data=veteran); mod0 sr0 = cox.zph(mod0, transform='log'); sr0 plot(sr0[4]) # Karnofsky score mod1 = coxph(Surv(time,status) ~ celltype + tt(karno), tt = function(x,t,...){x*log(t)} , data=veteran); mod sr1 = cox.zph(mod1); sr1; plot(sr1[4]) zph1 = cox.zph(mod0, transform='identity'); zph1; plot(zph1[4]) # I like this one. zph2 = cox.zph(mod0, transform='rank'); zph2; plot(zph2[4]) zph3 = cox.zph(mod0, transform='km'); zph3; plot(zph3[4]) # Default zph4 = cox.zph(mod0, transform='log'); zph4; plot(zph4[4]) # TnG like this one # They did not work to fit beta(t), and neiher did I. # Experiment mod = coxph(Surv(time,status) ~ celltype + tt(karno), tt = function(x,t, ...){x*log(t)} , data=veteran); mod loginter = function(x,t,...) {x*log(t)} coxph(Surv(time,status) ~ celltype + tt(karno), tt = loginter , data=veteran) D. Oakes. Frailty models for multiple event times. In J. P. Klein and P. K. Goel, editors, Survival Analysis, State of the Art. Kluwer, Netherlands, 1992.