********************************
** stphplot ** Evaluating the proportional hazards assumption
********************************

Overview:

   "stphplot" plots -ln(-ln(survival)) curves for each category of a
   nominal or ordinal independent variable versus ln(time). These are
   often referred to as "log-log" plots. Optionally, these estimates
   can be adjusted for covariates. Parallel curves are consistent with
   the proportional hazards assumption (under PH the curves differ
   only by a constant vertical shift equal to the log hazard ratio).

Usage:

   "stset ftime, failure(status)"
   "stphplot, by(Svar)"
   "stphplot, strata(Svar) adjust(x1 x2)"

   Where - ftime is the follow-up time
           status is the censoring/death indicator
           (status=1 if the event time was observed and 0 if censored)

   Where - "Svar" is the covariate of interest, and we then want to
           adjust for x1 and x2.

Summaries:

   After fitting KM curves for each level of Svar, the log-log plot
   is created.

Options:

   (1) The "by(Svar)" option is required unless "strata" is specified.

   (2) "adjust" is used to assess the PH assumption after adjusting
       for other covariates. In general, if you want to adjust then
       you should use the "strata" option rather than the "by" option
       for the covariate of interest.

********************************
** stphtest ** Evaluating the proportional hazards assumption
********************************

Overview:

   This command can compute a global test, tests for each covariate,
   and plot the Schoenfeld residuals.

Usage:

   "stset ftime, failure(status)"
   "stcox x1 x2 x3, scaledsch(resid0*) schoenfeld(resid1*)"
   "stphtest"
   "stphtest, detail"
   "stphtest, plot( x1 )"

   Note that the option "scaledsch(resid0*)" is required in order to
   create the scaled Schoenfeld residuals required for stphtest. The
   addition of "schoenfeld" is required for a global test of any PH
   violations.

Summaries:

   STATA will create the test statistics and report p-values for
   testing the null hypothesis that the PH assumption holds for all
   variables (the global test), and, with "detail", the test for each
   individual covariate.

Options:

   (1) The "plot( x1 )" option is used to view a plot of the scaled
       Schoenfeld residuals for x1 versus time.

********************************
** stcox ** Stratified Cox Regression
********************************

Overview:

   To model survival as a function of covariates the Cox model may be
   used. This model focuses on estimates of hazard ratios.

Usage:

   "stset ftime, failure(status)"
   "stcox tx clinic"
   "stcox tx, strata(clinic)"

   Where - ftime is the follow-up time
           status is the censoring/death indicator
           (status=1 if the event time was observed and 0 if censored)

   Where - "tx clinic" are the covariates

   Rather than assume PH for clinic, the "strata" option allows a
   separate baseline hazard for every level of clinic -- ONLY use it
   for discrete covariates that take a small number of values.

Summaries:

   STATA will return estimates of the hazard ratios (exponentiated
   coefficients), standard errors, and confidence intervals.

Options:

   (1) "stcox tx wbc, nohr" -- the "nohr" option means that regression
       coefficients (log hazard ratios) will be reported rather than
       hazard ratios.

   (2) "stcox tx wbc, basesurv( s0 )" -- this option saves the
       estimate of the baseline survival as the variable "s0".

       NOTE: the saved s0 is the survival function at X=0 for every
       covariate in the model. So, you might want to center your
       predictors so that the value X=0 is meaningful.

   (3) "stcox tx wbc"
       "lrtest, saving(1)" -- as with logistic regression, this saves
       the fitted model so that it can be compared with another model
       in a likelihood ratio test.

   (4) "by Svar: cox ftime tx, dead(status)" -- this means a separate
       Cox regression is fitted for each level of "Svar". Therefore we
       obtain a hazard ratio for "tx" for each level of "Svar". The
       data need to be sorted (i.e. use "sort Svar" before this
       command). Note also that to use "by" we need to use "cox"
       rather than "stcox", and this requires us to give the time
       argument (ftime above) and the status indicator using
       "dead(status)".

   (5) "stcox tx, strata(Svar)" -- this means a separate baseline
       hazard is assumed for every level of "Svar" but hazard ratios
       within these strata are assumed to be equal (common). A single
       "tx" HR is estimated.


Comments on log-log plots:
--------------------------

1. If you wanted to look at treatment adjusted for WBC you would use:

      stphplot, strata(tx) adjust( newlwbc )

   Here there is a subtle difference between the use of "by(tx)" and
   the use of "strata(tx)" -- essentially, use of "by" leads to a
   separate Cox regression for each level of tx to adjust for newlwbc,
   while with "strata" a single Cox model is used for adjustment (and
   that's my preference here).

2. For the adjusted log-minus-log plots in (1) the plots are shown
   for the average value of each adjustment variable. Alternatively,
   if you wanted to look at the adjusted log-minus-log plots for the
   value X=0 for each adjustment variable then you could use:

      stphplot, strata(tx) adjust( newlwbc ) zero


Comments on residual plots:
---------------------------

1. Scaled Schoenfeld residuals are slightly different (and better!)
   than described in lecture. They are multiplied by a certain matrix
   (the inverse of the information matrix, i.e. the estimated
   covariance matrix of the coefficients), which puts each residual on
   the same scale as its corresponding beta. If we want to see how a
   covariate comparison changes as a function of time, then we are
   thinking about beta(t) rather than a single PH beta. After scaling,
   the level of the scatterplot shows us beta(t) -- it won't be
   centered at zero like other residual plots, but rather shows how
   the value of the regression coefficient varies over time.

2. A key violation of PH is a hazard ratio that diminishes over time
   (a log hazard ratio that drifts toward zero). We detect this by
   seeing a decreasing trend in the residual plot (or an increasing
   trend for a negative logHR).
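

Example do-file sketch:
-----------------------

The checks described in these notes can be strung together into one
short do-file. This is only a sketch, not a definitive recipe: it
assumes a dataset is already in memory with the hypothetical variables
ftime, status, tx and wbc used in the usage lines above, and the
residual variable stubs sch* and sca* are made-up names. It uses the
same older command syntax as these notes; newer Stata releases provide
the test through "estat phtest" instead, so check the help files for
your version.

```stata
* Sketch only: ftime, status, tx and wbc are assumed (hypothetical) variables.

* Declare the survival-time structure
stset ftime, failure(status)

* Graphical check: log-log curves for tx, unadjusted, then adjusted for wbc
stphplot, by(tx)
stphplot, strata(tx) adjust(wbc)

* Fit the Cox model, saving both kinds of Schoenfeld residuals
* (sch* and sca* are arbitrary stub names chosen for this sketch)
stcox tx wbc, schoenfeld(sch*) scaledsch(sca*)

* Global test of PH, then the test for each covariate
stphtest
stphtest, detail

* Scaled Schoenfeld residuals for tx plotted against time
stphtest, plot(tx)
```

Roughly parallel log-log curves and a residual plot with no clear trend
are what the PH assumption predicts; a clear trend suggests that the
coefficient for tx varies over time.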