**************************
* Biostatistics 513 * Exercise Set 7, 2002
***************************
1(a) I used Stata for this, since it reports the 95% CIs automatically.
Here are the rows of the survivor function listing around 3 years and 6 years:
           Beg.           Net    Survivor      Std.
  Time    Total   Fail    Lost   Function     Error     [95% Conf. Int.]
-------------------------------------------------------------------------------
stage34=0
   2.5       43      0       1     0.8795    0.0461     0.7513    0.9440
   2.6       42      0       1     0.8795    0.0461     0.7513    0.9440
   3.2       41      1       1     0.8581    0.0497     0.7250    0.9297
   5.5       22      0       1     0.6415    0.0731     0.4798    0.7646
   5.9       21      0       2     0.6415    0.0731     0.4798    0.7646
   6         19      1       0     0.6077    0.0767     0.4411    0.7385
   6.1       18      0       1     0.6077    0.0767     0.4411    0.7385
stage34=1
   2.3       21      1       0     0.5000    0.0791     0.3383    0.6419
   2.9       20      0       1     0.5000    0.0791     0.3383    0.6419
   3.2       19      1       0     0.4737    0.0792     0.3140    0.6175
   5         10      1       1     0.3537    0.0797     0.2040    0.5068
   5.1        8      0       1     0.3537    0.0797     0.2040    0.5068
   6.3        7      1       0     0.3032    0.0828     0.1543    0.4666
From this we can obtain the following:
             stage34 = 0                  stage34 = 1
S(3yr)    0.8795 (0.7513, 0.9440)      0.5000 (0.3383, 0.6419)
S(6yr)    0.6077 (0.4411, 0.7385)      0.3537 (0.2040, 0.5068)
The 3 year confidence intervals do not overlap, which indicates a
statistically significant difference in 3 year survival for stage34=1
compared to stage34=0. However, the confidence intervals for 6 year
survival overlap, so although the estimated 6 year survival is lower for
stage34=1 (35%, versus 61% for stage34=0), this comparison of intervals
does not show a significant difference at 6 years.
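For anyone who wants to check the intervals by hand, here is a minimal Python
sketch. It assumes the intervals are built on the log(-log S) scale and then
back-transformed, which is consistent with the printed values. The function
name loglog_ci is mine and is not part of Stata.

```python
import math

def loglog_ci(s, se, z=1.96):
    """Approximate 95% CI for S(t) on the log(-log S) scale (assumed method)."""
    # Delta-method standard error of log(-log S)
    se_loglog = se / (s * abs(math.log(s)))
    factor = math.exp(z * se_loglog)
    # S is below 1, so the larger exponent gives the lower limit
    return s ** factor, s ** (1.0 / factor)

# (survivor estimate, standard error) copied from the listing above
estimates = {
    "stage34=0, 3yr": (0.8795, 0.0461),
    "stage34=1, 3yr": (0.5000, 0.0791),
    "stage34=0, 6yr": (0.6077, 0.0767),
    "stage34=1, 6yr": (0.3537, 0.0797),
}

for label, (s, se) in estimates.items():
    lo, hi = loglog_ci(s, se)
    print(f"{label}: {s:.4f} ({lo:.4f}, {hi:.4f})")
```

Small differences in the last digit come from using the rounded estimates and
standard errors.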
1(b) Here are the log-rank and Wilcoxon (Breslow) tests:
Log-rank test for equality of survivor functions
------------------------------------------------
        |   Events
stage34 |  observed       expected
--------+-------------------------
0       |        22          32.58
1       |        28          17.42
--------+-------------------------
Total   |        50          50.00
              chi2(1) =      10.13
              Pr>chi2 =     0.0015
Wilcoxon (Breslow) test for equality of survivor functions
----------------------------------------------------------
        |   Events                   Sum of
stage34 |  observed       expected    ranks
--------+--------------------------------------
0       |        22          32.58     -802
1       |        28          17.42      802
--------+--------------------------------------
Total   |        50          50.00        0
              chi2(1) =      14.06
              Pr>chi2 =     0.0002
Each of these tests rejects the null hypothesis that the two groups
have equal survival curves. We find that the Wilcoxon test statistic
is larger, which is expected since we saw a large difference in the
survival curves for early times, and this statistic places more weight
on the comparison for early times.
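As a rough check on the log-rank output, the observed and expected counts can
be combined by hand. The sketch below uses the simple sum of (O-E)^2/E, which
is only a conservative approximation: the statistic Stata reports uses a
variance that cannot be recovered from this table. It also converts the
reported chi2(1) values to p-values. The function name is my own.

```python
import math

def chi2_df1_pvalue(x):
    # Upper tail of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x/2))
    return math.erfc(math.sqrt(x / 2.0))

# Observed and expected events for stage34 = 0 and stage34 = 1
observed = [22, 28]
expected = [32.58, 17.42]

# Hand approximation to the log-rank statistic (not the exact Stata value)
approx_chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(f"approximate chi2(1) = {approx_chi2:.2f}  (reported: 10.13)")

# p-values implied by the reported test statistics
for name, stat in [("log-rank", 10.13), ("Wilcoxon", 14.06)]:
    print(f"{name}: chi2(1) = {stat:.2f}, p = {chi2_df1_pvalue(stat):.4f}")
```

The approximation comes out a little below the reported 10.13, as expected for
a conservative shortcut, and the p-values agree with the output above.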
1(c) Here are the log-rank and the Wilcoxon (Breslow) tests for stage:
Log-rank test for equality of survivor functions
------------------------------------------------
      |   Events
stage |  observed       expected
------+-------------------------
1     |        15          22.57
2     |         7          10.01
3     |        17          14.08
4     |        11           3.34
------+-------------------------
Total |        50          50.00
            chi2(3) =      22.76
            Pr>chi2 =     0.0000
Wilcoxon (Breslow) test for equality of survivor functions
----------------------------------------------------------
      |   Events                   Sum of
stage |  observed       expected    ranks
------+--------------------------------------
1     |        15          22.57     -579
2     |         7          10.01     -223
3     |        17          14.08      244
4     |        11           3.34      558
------+--------------------------------------
Total |        50          50.00        0
            chi2(3) =      23.18
            Pr>chi2 =     0.0000
Again we reject the null hypothesis that the groups have the same survival
functions. Here the test statistics are quite similar, which is consistent
with the KM curves that show clear differences between these groups both
early in time and at later times.
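The printed "Pr>chi2 = 0.0000" is a rounded value, not an exact zero. A short
sketch of the chi-square(3) tail probability shows how small the p-values are.
The closed form used here holds for 3 degrees of freedom only, and the function
name is my own.

```python
import math

def chi2_df3_pvalue(x):
    # Upper tail of a chi-square with 3 df:
    # erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

# Test statistics reported in the output above
p_logrank = chi2_df3_pvalue(22.76)
p_wilcoxon = chi2_df3_pvalue(23.18)

print(f"log-rank: p = {p_logrank:.2e}")
print(f"Wilcoxon: p = {p_wilcoxon:.2e}")
```

Both values are below 0.00005, which is why they print as 0.0000 to four
decimal places.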
1(d) Cox regression with dummy variables for stage:
                                                  LR chi2(3)      =      16.26
Log likelihood =   -189.08124                     Prob > chi2     =     0.0010
------------------------------------------------------------------------------
      _t |
      _d | Haz. Ratio   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
Istage_2 |   1.067972    .489604      0.143   0.886      .4348436    2.622932
Istage_3 |   1.844227    .655076      1.723   0.085      .9193153     3.69968
Istage_4 |   5.600403   2.350266      4.105   0.000       2.46039    12.74778
------------------------------------------------------------------------------
This regression estimates the hazard for stage=2 to be 1.068 times the hazard
for stage=1 patients. This small increase is not significant as the confidence
interval for this comparison contains 1.0 (95% CI 0.435, 2.623). The
estimated hazard ratio comparing stage=3 to stage=1 is 1.844 with a 95%
confidence interval 0.919, 3.700. Thus although there is an increased risk
of death for stage=3 patients relative to stage=1 patients, this increase
is not significant at the nominal 5% level. Finally, the hazard ratio
comparing stage=4 to stage=1 patients is 5.600 which is statistically
significant with a 95% CI (2.460, 12.748).
These summaries are in agreement with what the KM plots show: stage 1 and
stage 2 are similar; and stage 3 has poorer survival rates while stage 4
patients fare worst of all.
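The z statistics and confidence intervals in the Cox output can be recovered
from the hazard ratio and its standard error. This is a minimal sketch that
assumes the usual Wald construction on the log hazard ratio scale, with
se(log HR) = se(HR)/HR by the delta method. The function name is mine.

```python
import math

def wald_summary(hr, se_hr, z_crit=1.96):
    """Return (z, lower, upper) for a hazard ratio, working on the log scale."""
    log_hr = math.log(hr)
    se_log = se_hr / hr  # delta method
    z = log_hr / se_log
    lower = math.exp(log_hr - z_crit * se_log)
    upper = math.exp(log_hr + z_crit * se_log)
    return z, lower, upper

# (hazard ratio, standard error) copied from the output above
cox_rows = {
    "Istage_2": (1.067972, 0.489604),
    "Istage_3": (1.844227, 0.655076),
    "Istage_4": (5.600403, 2.350266),
}

for name, (hr, se) in cox_rows.items():
    z, lower, upper = wald_summary(hr, se)
    print(f"{name}: HR = {hr:.3f}, z = {z:.3f}, 95% CI ({lower:.3f}, {upper:.3f})")
```

An interval that contains 1.0 corresponds to |z| < 1.96, which is the case for
stage 2 and stage 3 but not for stage 4.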
1(e) Here is a Cox regression that uses stage=1,2,3,4 as the predictor
variable in a linear log hazard model:
                                                  LR chi2(1)      =      13.10
Log likelihood =   -190.66133                     Prob > chi2     =     0.0003
------------------------------------------------------------------------------
      _t |
      _d | Haz. Ratio   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
   stage |   1.657713   .2338867      3.582   0.000      1.257226    2.185774
------------------------------------------------------------------------------
In this model we estimate the following comparisons:
stage 2 versus stage 1: hazard ratio of 1.658
stage 3 versus stage 1: hazard ratio of 1.658*1.658 = 2.749
stage 4 versus stage 1: hazard ratio of 1.658*1.658*1.658 = 4.558
This model estimates a higher relative risk for stage 2 and stage 3 patients
than the dummy variable (unrestricted) model does (1.658 versus 1.068, and
2.749 versus 1.844). Stage 4 is estimated to have a somewhat lower relative
risk (4.558 versus 5.600).
This model assumes that a 1 unit difference in stage multiplies the hazard of
death by exp(b)=1.658. That is, 2 versus 1, 3 versus 2, and 4 versus 3 all
have the same hazard ratio.
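The comparison between the two models can be laid out in a few lines of
Python. This sketch uses the rounded per-unit hazard ratio 1.658, as the text
does, and the variable names are my own.

```python
# Per-unit hazard ratio from the linear model (rounded, as in the text)
hr_per_stage = 1.658

# Hazard ratios versus stage 1 from the dummy variable model in 1(d)
dummy_hr = {2: 1.068, 3: 1.844, 4: 5.600}

# Under the linear model, stage k versus stage 1 is the per-unit HR to the power k-1
linear_hr = {stage: hr_per_stage ** (stage - 1) for stage in dummy_hr}

for stage in sorted(dummy_hr):
    print(f"stage {stage} vs 1: linear = {linear_hr[stage]:.3f}, "
          f"dummy = {dummy_hr[stage]:.3f}")
```

The linear model sits above the dummy variable estimates for stages 2 and 3
and below the estimate for stage 4.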
1(f) A likelihood ratio test computes:
LR test = 2*( -189.08 - (-190.66) ) = 3.16
This test has 2 degrees of freedom (3 parameters for the dummy variable model
versus 1 parameter for the linear model).
The 5% critical value for a chi-square(2) is 5.99. Since the test statistic
is smaller than this critical value, we do not reject the null hypothesis:
there is not enough evidence to reject the linear model.
Note: I might have reservations using this linear model since the fitted
hazards differ systematically from the dummy variable model. However,
we do not have enough evidence to reject the linear model in favor of the
dummy variable model. This says that the variation we see (away from the
linear model) is possibly due to chance.
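Here is a short sketch of the same likelihood ratio calculation using the full
log likelihoods from the two Cox fits. For 2 degrees of freedom the chi-square
tail probability has the closed form exp(-x/2), so no statistics library is
needed. The variable names are mine.

```python
import math

# Log likelihoods from the Cox output in 1(d) and 1(e)
loglik_dummy = -189.08124   # 3 parameters
loglik_linear = -190.66133  # 1 parameter

lr_stat = 2.0 * (loglik_dummy - loglik_linear)
df = 3 - 1

# chi-square(2): P(X > x) = exp(-x/2), so the 5% critical value is -2*log(0.05)
critical_value = -2.0 * math.log(0.05)
p_value = math.exp(-lr_stat / 2.0)

print(f"LR = {lr_stat:.2f} on {df} df")
print(f"5% critical value = {critical_value:.2f}")
print(f"p-value = {p_value:.3f}")
```

The p-value is roughly 0.21, in line with the conclusion that the departure
from the linear model could be due to chance.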
1(g) SEE the web page for the plots.
2(a) None of the p(PH) values indicate a violation of the PH assumption.
2(b) Use:
P = platelet
A = age
S = sex
Based on model 1 we have:
h( t, X ) = h0(t) exp( 0.470*P + 0.000*A + 0.183*S - 0.008*P*A - 0.503*P*S )
The hazard ratios are obtained from the Cox regression part: dividing the
hazard by the baseline hazard h0(t) leaves just the "exp" part above, so a
hazard ratio is a ratio of two such "exp" terms.
2(c) Compare (P=1, A=40, S=0) to (P=0, A=40, S=0):
exp( 0.470*(1) + 0.000*(40) +0.183*(0) -0.008*(1)*(40) -0.503*(1)*(0) )
-----------------------------------------------------------------------
exp( 0.470*(0) + 0.000*(40) +0.183*(0) -0.008*(0)*(40) -0.503*(0)*(0) )
= exp( 0.15 )
-----------
exp( 0.00 )
= 1.162
Compare (P=1, A=50, S=1) to (P=0, A=50, S=1):
exp( 0.470*(1) + 0.000*(50) +0.183*(1) -0.008*(1)*(50) -0.503*(1)*(1) )
-----------------------------------------------------------------------
exp( 0.470*(0) + 0.000*(50) +0.183*(1) -0.008*(0)*(50) -0.503*(0)*(1) )
= exp(-0.250 )
-----------
exp( 0.183 )
= 0.649
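The two comparisons above follow the same pattern, so they can be wrapped in a
small function. This is a sketch using the model 1 coefficients as given in
2(b). The function and dictionary names are my own.

```python
import math

# Model 1 coefficients from 2(b)
COEF = {"P": 0.470, "A": 0.000, "S": 0.183, "PA": -0.008, "PS": -0.503}

def linear_predictor(p, a, s):
    """The part inside exp( ) for platelet p, age a, sex s."""
    return (COEF["P"] * p + COEF["A"] * a + COEF["S"] * s
            + COEF["PA"] * p * a + COEF["PS"] * p * s)

def platelet_hr(age, sex):
    """Hazard ratio for P=1 versus P=0 at a fixed age and sex."""
    return math.exp(linear_predictor(1, age, sex) - linear_predictor(0, age, sex))

print(f"A=40, S=0: HR = {platelet_hr(40, 0):.3f}")
print(f"A=50, S=1: HR = {platelet_hr(50, 1):.3f}")
```

Because the baseline hazard and the A and S main effects cancel, the ratio
reduces to exp( 0.470 - 0.008*A - 0.503*S ).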
2(d) We can execute a likelihood ratio test comparing model 2 to model 1:
           terms                       # parameters      -2*log L
---------------------------------------------------------------------------
model 1:   P + A + S + P*A + P*S            5             306.080
model 2:   P + A + S                        3             306.505
LR test = 306.505 - 306.080 = 0.425
The two models differ by 5 - 3 = 2 parameters, so we compare this test
statistic to a chi-square(df=2). This yields a non-significant p-value
(clearly, since the critical value for chi-square(2) is 5.99; the p-value
calculation gives p=0.809).
Therefore, we fail to reject the null hypothesis that the coefficients of
P*A and P*S are zero.
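The p-value quoted above can be checked directly from the -2*log L values.
This sketch takes model 2 to have three coefficients (P, A, S) and uses the
closed form exp(-x/2) for the chi-square(2) tail. The variable names are mine.

```python
import math

# (-2 * log likelihood, number of coefficients) for the nested models
model1 = (306.080, 5)  # P + A + S + P*A + P*S
model2 = (306.505, 3)  # P + A + S

lr_stat = model2[0] - model1[0]
df = model1[1] - model2[1]

# The closed form below applies only when df == 2
p_value = math.exp(-lr_stat / 2.0)

print(f"LR = {lr_stat:.3f} on {df} df, p = {p_value:.3f}")
```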
2(e) We can summarize these models as follows:
           terms            platelet=1 vs platelet=0 HR
---------------------------------------------------------------------------
model 5:   P                exp(-0.694) = 0.500     (1/0.500 = 2.000)
model 4:   P + S            exp(-0.705) = 0.494     (1/0.494 = 2.024)
model 3:   P + A            exp(-0.706) = 0.494     (1/0.494 = 2.024)
model 2:   P + A + S        exp(-0.725) = 0.484     (1/0.484 = 2.066)
The model that adjusts for A and S yields an estimated hazard ratio (HR) of
0.484, which is not meaningfully different from the unadjusted estimate
(HR=0.500). We therefore conclude that it is not necessary to adjust for
confounding by A and S. However, in presenting these results we would likely
choose the model that does the adjustment so that readers can see that
we have accounted for these (potentially important) variables.
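To put a number on "not meaningfully different", the sketch below computes the
percent change in the platelet hazard ratio relative to the unadjusted model.
How large a change counts as meaningful is a judgment call. A threshold of
about 10% is a common rule of thumb, but that figure is my assumption and is
not taken from the exercise.

```python
import math

# Platelet coefficients from the table above
platelet_coef = {
    "model 5: P": -0.694,
    "model 4: P + S": -0.705,
    "model 3: P + A": -0.706,
    "model 2: P + A + S": -0.725,
}

hr = {name: math.exp(b) for name, b in platelet_coef.items()}
crude = hr["model 5: P"]

# Percent change in the HR relative to the unadjusted (crude) estimate
pct_change = {name: 100.0 * (value - crude) / crude for name, value in hr.items()}

for name in platelet_coef:
    print(f"{name:<20} HR = {hr[name]:.3f}  change = {pct_change[name]:+.1f}%")
```

The fully adjusted hazard ratio differs from the unadjusted one by about 3%.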