R Code for Correlates of Risk Assessment in Moodie et al. (2017, JID, Table 1)

September 24???26, 2018

CoR analysis by quantitative average titer

# 'data' includes all vaccine recipients in the month 13 at-risk cohort in CYD14
dataI <- subset(data, !is.na(IMPSTLOG.AUCMB))

library(survival)
fit <- cch(Surv(oftime_m13, ofstatus_m13) ~ IMPSTLOG.AUCMB + MALE + AGE + COUNTRY,
           subcoh = dataI$FASI=="Y",
           id = dataI$SUBJID,
           cohort.size = NROW(data),
           data = dataI,
           method = "LinYing")

CoR analysis by quantitative average titer

Estimators for unstratified case-cohort data:

method = "Prentice": Prentice (1986, Biometrika) (default)
method = "SelfPrentice": Self and Prentice (1988, Annals of Statistics)
method = "LinYing": Lin and Ying (1993, JASA)
- generally more efficient than the Prentice estimator

Estimators for stratified case-cohort data (Borgan et al., 2000, LIDA):

method = "I.Borgan" a generalization of the Self-Prentice estimator
method = "II.Borgan" a generalization of the Lin-Ying estimator
- allows stratification by study in the CYD14+CYD15 pooled analysis
  3 methods available for unstratified case-cohort data: Prentice (1986), Self and Prentice (1988), Lin and Ying (1993). The methods differ in the choice of risk sets they use to compare covariate values of the failure with covariate values of others at risk at the time of the failure. "Prentice" uses the sub-cohort members "at risk" plus the failure if that occurs outside the sub-cohort and is score unbiased. "SelfPren" (Self-Prentice) uses just the sub-cohort members "at risk". These two have the same asymptotic variance-covariance matrix. "LinYing" (Lin-Ying) uses the all members of the sub-cohort and all failures outside the sub-cohort who are "at risk". The Lin-Ying estimator has a better variance than the Prentice estimator. The Lin-Ying (1993, JASA) method is best suited to the sampling design of CYD14 and CYD15 where the method uses data from all members of the immuno subset and all cases outside the immuno subset who are "at risk" (since we have Mon 13 titers on cases outset the immuno subset). For the combined CYD14 and CYD15 analysis, we used the Borgan.II method- a generalization of the Lin-Ying estimator that allows stratification (by study). 2 methods available for stratified case-cohort data: Borgan I, a generalization of the Self-Prentice estimator, and Borgan II, a generalization of the Lin-Ying estimator. The choice of method depends on the sampling design- eg Borgan II for stratified outcome-dependent retrospective two-phase sampling vs Borgan I for case-cohort sampling where the subcohort is defined at baseline. As the Lin-Ying method applies when there is phase 2 data on the simple random sample defined at baseline (i.e., the immunogenicity subset) and on all post-Month 13 cases outside the immuno subset, we defined the sub cohort as the non-cases in the immuno subset as all of the post-Month 13 cases are selected with probability 1. If the sampling design includes only phase 2 data on the simple random sample defined at baseline, then subcoh would be defined as FASI==Y. Our cch results and the tps results were nearly identical.

CoR analysis by quantitative average titer

fit <- cch(Surv(oftime_m13, ofstatus_m13) ~ IMPSTLOG.AUCMB + MALE + AGE + COUNTRY,
           subcoh = dataI$FASI=="Y",
           id = dataI$SUBJID,
           cohort.size = NROW(data),
           data = dataI,
           method = "LinYing")

subcoh: indicates subjects sampled into the subcohort
id: vector of unique identifiers
cohort.size: size of the entire cohort from which the subcohort was sampled
data: optional data frame for interpreting variables in the formula
method: "Prentice", "SelfPrentice", "LinYing", "I.Borgan", "II.Borgan"
The estimator does not distinguish between cases in the sampled subcohort and cases outside the subcohort.

CoR analysis by quantitative average titer

fit

## Case-cohort analysis,x$method, LinYing 
##  with subcohort of 1295 from cohort of 6639 
## 
## Call: cch(formula = Surv(oftime_m13, ofstatus_m13) ~ IMPSTLOG.AUCMB + 
##     MALE + AGE + COUNTRY, data = dataI, subcoh = dataI$FASI == 
##     "Y", id = dataI$SUBJID, cohort.size = NROW(data), method = "LinYing")
## 
## Coefficients:
##                     Value        SE        Z            p
## IMPSTLOG.AUCMB -1.3864102 0.2119061 6.542569 6.047096e-11
## MALE           -0.2284395 0.2074144 1.101368 2.707367e-01
## AGE>11         -1.1227027 0.3627712 3.094796 1.969486e-03
## AGE>5,<=11      0.3557094 0.2173679 1.636439 1.017477e-01
## COUNTRYMYS     -1.7884327 0.5734435 3.118760 1.816137e-03
## COUNTRYPHL      0.6950063 0.2922319 2.378270 1.739409e-02
## COUNTRYTHA     -0.5217174 0.3690726 1.413590 1.574823e-01
## COUNTRYVNM     -1.8075696 0.4944307 3.655860 2.563208e-04

CoR analysis by quantitative average titer

sfit <- summary(fit)$coef
sfit

##                     Value        SE        Z            p
## IMPSTLOG.AUCMB -1.3864102 0.2119061 6.542569 6.047096e-11
## MALE           -0.2284395 0.2074144 1.101368 2.707367e-01
## AGE>11         -1.1227027 0.3627712 3.094796 1.969486e-03
## AGE>5,<=11      0.3557094 0.2173679 1.636439 1.017477e-01
## COUNTRYMYS     -1.7884327 0.5734435 3.118760 1.816137e-03
## COUNTRYPHL      0.6950063 0.2922319 2.378270 1.739409e-02
## COUNTRYTHA     -0.5217174 0.3690726 1.413590 1.574823e-01
## COUNTRYVNM     -1.8075696 0.4944307 3.655860 2.563208e-04

CoR analysis by quantitative average titer

# estimated hazard ratio per 10-fold titer increase
exp(sfit[1,1])

## [1] 0.249971

# Wald 95% CI for the hazard ratio
exp(sfit[1,1] + c(-1,1) * qnorm(0.975) * sfit[1,2])

## [1] 0.1650118 0.3786731

# p-value from 2-sided Wald test of H0: HR = 1
sfit[1,4]

## [1] 6.047096e-11

CoR analysis by quantitative serotype-specific titer

# 'data' includes all vaccine recipients in the month 13 at-risk cohort in CYD14
dataI <- subset(data, !is.na(IMPSTLOG.Sero1))
fit <- cch(Surv(s1ftime_m13, s1fstatus_m13) ~ IMPSTLOG.Sero1 + MALE + AGE + COUNTRY,
           subcoh = with(dataI, s1fstatus_m13==0 | (s1fstatus_m13==1 & FASI=="Y")),
           id = dataI$SUBJID,
           cohort.size = NROW(data),
           data = dataI,
           method = "LinYing")

dataI <- subset(data, !is.na(IMPSTLOG.Sero2))
fit <- cch(Surv(s2ftime_m13, s2fstatus_m13) ~ IMPSTLOG.Sero2 + MALE + AGE + COUNTRY,
           subcoh = with(dataI, s2fstatus_m13==0 | (s2fstatus_m13==1 & FASI=="Y")),
           id = dataI$SUBJID,
           cohort.size = NROW(data),
           data = dataI,
           method = "LinYing")

# etc.

CoR analysis by low, medium, high average titer

Low, medium, and high titer subgroups defined by weighted tertiles of month 13 titers pooling over DENV1–4 and over the vaccine and placebo groups within each trial.

Why weighted? To account for the case-cohort biomarker sampling design.

# here 'data' includes all vaccine and placebo recipients in the month 13 at-risk cohort in CYD14

wCase <- NROW(subset(data, ofstatus_m13==1)) / 
  NROW(subset(data, ofstatus_m13==1 & !is.na(IMPSTLOG.AUCMB)))
wCase

## [1] 1.004098

wControl <- NROW(subset(data, ofstatus_m13==0)) / 
  NROW(subset(data, ofstatus_m13==0 & !is.na(IMPSTLOG.AUCMB)))
wControl

## [1] 5.116551

CoR analysis by low, medium, high average titer

# pool over DENV1-4
IMPSTLOG <- with(data, c(IMPSTLOG.Sero1, IMPSTLOG.Sero2, IMPSTLOG.Sero3, IMPSTLOG.Sero4))
wts <- rep(ifelse(data$ofstatus_m13==1, wCase, wControl), 4)

library(Hmisc)
q <- wtd.quantile(IMPSTLOG, weights = wts, probs = c(1/3, 2/3))
# weighted tertiles defining low, medium, high titer subgroups in CYD14
10^q

##    33.33%    66.67% 
##  58.00027 265.99900

CoR analysis by low, medium, high average titer

data$cat.Sero1 <- cut(data$IMPSTLOG.Sero1, breaks=c(0, q[1], q[2], Inf), 
                      right=FALSE, labels=c("Low","Medium","High"))
data$cat.Sero2 <- cut(data$IMPSTLOG.Sero2, breaks=c(0, q[1], q[2], Inf), 
                      right=FALSE, labels=c("Low","Medium","High"))
data$cat.Sero3 <- cut(data$IMPSTLOG.Sero3, breaks=c(0, q[1], q[2], Inf), 
                      right=FALSE, labels=c("Low","Medium","High"))
data$cat.Sero4 <- cut(data$IMPSTLOG.Sero4, breaks=c(0, q[1], q[2], Inf), 
                      right=FALSE, labels=c("Low","Medium","High"))
data$cat.AUCMB <- cut(data$IMPSTLOG.AUCMB, breaks=c(0, q[1], q[2], Inf), 
                      right=FALSE, labels=c("Low","Medium","High"))

CoR analysis by low, medium, high average titer

data <- subset(data, VACC==1)
dataI <- subset(data, !is.na(cat.AUCMB))

fit <- cch(Surv(oftime_m13, ofstatus_m13) ~ cat.AUCMB + MALE + AGE + COUNTRY,
           subcoh = dataI$FASI=="Y",
           id = dataI$SUBJID,
           cohort.size = NROW(data),
           data = dataI,
           method = "LinYing")

sfit <- summary(fit)$coef

CoR analysis by low, medium, high average titer

# estimated hazard ratios med vs. low and high vs. low
exp(sfit[1:2,1])

## cat.AUCMBMedium   cat.AUCMBHigh 
##       0.3695371       0.1003635

# Wald 95% CIs for the hazard ratio
exp(sfit[1:2,1] + qnorm(0.975) * sfit[1:2,2] %o% c(-1,1))

##                      [,1]      [,2]
## cat.AUCMBMedium 0.2312103 0.5906211
## cat.AUCMBHigh   0.0522652 0.1927254

# p-values from 2-sided Wald tests of 
# H0: HR(med vs. low) = 1 and H0: HR(high vs. low) = 1
sfit[1:2,4]

## cat.AUCMBMedium   cat.AUCMBHigh 
##    3.169712e-05    4.988898e-12

CoR analysis by low, medium, high average titer

# Wald test statistic for testing
# H0: HR(med vs. low) = HR(high vs. low) = 1
stat <- drop(t(fit$coef[1:2]) %*% solve(fit$var[1:2,1:2])
             %*% fit$coef[1:2])
1 - pchisq(stat, df = 2)

## [1] 2.847167e-11

CoR analysis by low, medium, high serotype-specific titers

# p-values from 2-sided Wald tests of
# H0: HR(med vs. low) = HR(high vs. low) = 1 for 
# serotype-specific endpoints/titers
p <- c(pSero1, pSero2, pSero3, pSero4)
round(p, digits = 5)

## [1] 0.00075 0.00123 0.29070 0.21186

# Holm-adjusted p-values
round(p.adjust(p, method = "holm"), digits = 5)

## [1] 0.00298 0.00368 0.42373 0.42373

R documentation: `cch` in R package `survival`

Summary of key R functions

cch in R package survival
summary
wtd.quantile in R package Hmisc
cut
p.adjust

CoR analysis by quantitative average titer

CoR analysis by quantitative average titer

CoR analysis by quantitative average titer

CoR analysis by quantitative average titer

CoR analysis by quantitative average titer

CoR analysis by quantitative average titer

CoR analysis by quantitative serotype-specific titer

CoR analysis by low, medium, high average titer

CoR analysis by low, medium, high average titer

CoR analysis by low, medium, high average titer

CoR analysis by low, medium, high average titer

CoR analysis by low, medium, high average titer

CoR analysis by low, medium, high average titer

CoR analysis by low, medium, high serotype-specific titers

R documentation: cch in R package survival

Summary of key R functions

R documentation: `cch` in R package `survival`