R package pssmooth: Flexible and Efficient Evaluation of Principal Surrogates / Treatment Effect Modifiers

September 24???26, 2018

Data: 9–16 year-olds at risk for VCD at month 13 in CYD14 + CYD15

head(data)

##    vax  age country bTiter m13Titer VCDpostM13 wts
## 2    1  >11     IDN     NA       NA          0  NA
## 6    1 <=11     IDN     NA       NA          0  NA
## 7    1 <=11     IDN     NA       NA          0  NA
## 8    1  >11     IDN     NA       NA          0  NA
## 10   0 <=11     IDN     NA       NA          0  NA
## 12   1 <=11     IDN     NA       NA          0  NA

Data: 9-16 year–olds at risk for VCD at month 13 in CYD14 + CYD15

head(subset(data, !is.na(bTiter)))

##     vax  age country   bTiter m13Titer VCDpostM13      wts
## 128   0 <=11     IDN 1.841060 1.464475          0 5.116551
## 129   0  >11     IDN 0.698970 0.698970          0 5.116551
## 132   0 <=11     IDN 2.511435 2.059343          0 5.116551
## 133   1 <=11     IDN 2.425587 2.626717          0 5.116551
## 134   1 <=11     IDN 2.431748 2.844152          0 5.116551
## 135   1 <=11     IDN 0.698970 1.410553          1 1.004098

R package `pssmooth`: inference about \(\, \mathop{\mathrm{mCEP}}(s_1)\) curves

riskCurve: calculates point estimates of \(P(Y(z)=1 | S(1)=s_1)\), \(z=0,1\)
bootRiskCurve: calculates bootstrap estimates of \(P(Y(z)=1 | S(1)=s_1)\), \(z=0,1\)
summary: tabulates point and interval estimates of \(\mathop{\mathrm{mCEP}}(s_1)\)
plotMCEPcurve: plots point and interval estimates of \(\mathop{\mathrm{mCEP}}(s_1)\)
testConstancy: tests \(H_0^1\) or \(H_0^2\)
testEquality: tests \(H_0^3\) or \(H_0^4\)
Parallel computing

Point estimation of \(\, P(Y(z)=1 \mid S(1)=s_1)\)

psRange <- range(data$m13Titer, na.rm = TRUE)
psGrid <- seq(psRange[1], psRange[2], length.out = 100)

library(pssmooth)
out1.riskCurve <- 
  riskCurve(formula = VCDpostM13 ~ m13Titer + factor(age) + factor(country), 
            bsm = "bTiter", 
            tx = "vax", 
            data = data, 
            psGrid = psGrid)

Point estimation of \(\, P(Y(z)=1 \mid S(1)=s_1)\)

names(out1.riskCurve)

## [1] "psGrid"         "plaRiskCurve"   "txRiskCurve"    "fOptBandwidths"
## [5] "gOptBandwidths"

out1.riskCurve$psGrid[seq(1, 100, by = 25)]

## [1] 0.698970 1.619664 2.540358 3.461051

out1.riskCurve$plaRiskCurve[seq(1, 100, by = 25)]

## [1] 0.07491339 0.07375328 0.03465637 0.02251459

out1.riskCurve$txRiskCurve[seq(1, 100, by = 25)]

## [1] 0.125437157 0.035485932 0.009349333 0.002415030

Point estimation of \(\, P(Y(z)=1 \mid S(1)=s_1)\)

out2.riskCurve <- 
  riskCurve(formula = VCDpostM13 ~ m13Titer + factor(age) + factor(country), 
            bsm = "bTiter", 
            tx = "vax", 
            data = data,
            bwtype = "generalized_nn"
            psGrid = psGrid)

out3.riskCurve <- 
  riskCurve(formula = VCDpostM13 ~ m13Titer + factor(age) + factor(country), 
            bsm = "bTiter", 
            tx = "vax", 
            data = data,
            hinge = TRUE,
            psGrid = psGrid)

out4.riskCurve <- 
  riskCurve(formula = VCDpostM13 ~ m13Titer + factor(age) + factor(country), 
            bsm = "bTiter", 
            tx = "vax", 
            data = data, 
            hinge = TRUE,
            weights = "wts",
            psGrid = psGrid)

Point estimation of \(\, P(Y(z)=1 \mid S(1)=s_1)\)

data$bTiterC <- cut(data$bTiter, breaks = c(-Inf, log10(5), 1:5), labels=FALSE)
table(data$bTiterC)

## 
##    1    2    3    4    5    6 
##  568   79  535 1431  213    2

data$m13TiterC <- cut(data$m13Titer, breaks = c(-Inf, log10(5), 1:5), labels=FALSE)
table(data$m13TiterC)

out5.riskCurve <- 
  riskCurve(formula = VCDpostM13 ~ m13TiterC + factor(age) + factor(country), 
            bsm = "bTiterC", 
            tx = "vax", 
            data = data,
            pstype = "ordered",
            bsmtype = "ordered",
            hinge = TRUE,
            psGrid = sort(unique(data$m13TiterC)))

Point estimation of \(\, P(Y(z)=1 \mid S(1)=s_1)\)

Bootstrap estimation of \(\, P(Y(z)=1 \mid S(1)=s_1)\)

out4.riskCurve <- 
  riskCurve(formula = VCDpostM13 ~ m13Titer + factor(age) + factor(country), 
            bsm = "bTiter", 
            tx = "vax", 
            data = data, 
            hinge = TRUE,
            weights = "wts",
            psGrid = psGrid)

out4.bootRiskCurve <-
  bootRiskCurve(formula = VCDpostM13 ~ m13Titer + factor(age) + factor(country), 
                bsm = "bTiter", 
                tx = "vax", 
                data = data,
                hinge = TRUE,
                weights = "wts",
                psGrid = psGrid,
                iter = 500,
                seed = 9)

Bootstrap estimation of \(\, P(Y(z)=1 \mid S(1)=s_1)\)

names(out4.bootRiskCurve)

## [1] "psGrid"           "plaRiskCurveBoot" "txRiskCurveBoot" 
## [4] "cpointPboot"      "cpointTboot"

out4.bootRiskCurve$psGrid[c(1, 25, 50, 73)]

## [1] 0.69897 1.89897 3.14897 4.29897

out4.bootRiskCurve$plaRiskCurveBoot[c(1, 25, 50, 73), 1:5]

##            [,1]       [,2]       [,3]       [,4]       [,5]
## [1,] 0.05858134 0.06466535 0.06039068 0.05671162 0.06807973
## [2,] 0.06427763 0.05405423 0.06489546 0.05948239 0.05704471
## [3,] 0.02618514 0.02103763 0.02850078 0.02860941 0.01639328
## [4,] 0.02056239 0.01335641 0.02347063 0.02484906 0.01458935

Summary of estimation of \(\, \mathop{\mathrm{mCEP}}(s_1)\)

summary(out4.riskCurve, out4.bootRiskCurve, contrast = "te")[c(1, 25, 50, 73),]

##     psGrid        te      ptLB      ptUB       smLB      smUB
## 1  0.69897 0.3538459 0.1763504 0.4930913 0.09006256 0.5411606
## 25 1.89897 0.4969972 0.3970106 0.5804042 0.35042102 0.6104988
## 50 3.14897 0.8904173 0.8396984 0.9250889 0.81260821 0.9359184
## 73 4.29897 0.9867314 0.9780499 0.9919793 0.97301160 0.9934766

summary(out4.riskCurve, out4.bootRiskCurve, contrast = "rd")[c(1, 25, 50, 73),]

##     psGrid         rd        ptLB       ptUB        smLB       smUB
## 1  0.69897 0.02262658 0.007442722 0.03781043 0.003134043 0.04211911
## 25 1.89897 0.03095291 0.019851007 0.04205482 0.016700651 0.04520517
## 50 3.14897 0.02164375 0.013176828 0.03011067 0.010774194 0.03251330
## 73 4.29897 0.02001904 0.010738979 0.02929909 0.008105604 0.03193247

Plotting of estimates of \(\, \mathop{\mathrm{mCEP}}(s_1)\)

plotMCEPcurve(summary(out4.riskCurve, out4.bootRiskCurve, contrast = "te"), 
              hingePoint = with(out4.riskCurve, round(min(cpointP, cpointT), 1)), 
              xLab = "Average Log10 M13 Titer in Vaccinees", 
              yLab = "Vaccine Efficacy")

Plotting of estimates of \(\, \mathop{\mathrm{mCEP}}(s_1)\)

plotMCEPcurve(summary(out4.riskCurve, out4.bootRiskCurve, contrast = "logrr"), 
              hingePoint = with(out4.riskCurve, round(min(cpointP, cpointT), 1)), 
              xLab = "Average Log10 M13 Titer in Vaccinees", 
              yLab = "Log Relative Risk")

Plotting of estimates of \(\, \mathop{\mathrm{mCEP}}(s_1)\)

plotMCEPcurve(summary(out4.riskCurve, out4.bootRiskCurve, contrast = "rd"), 
              hingePoint = with(out4.riskCurve, round(min(cpointP, cpointT), 1)), 
              xLab = "Average Log10 M13 Titer in Vaccinees", 
              yLab = "Risk Difference (Placebo - Vaccine)")

Test of \(\, \{H^1_0: \mathop{\mathrm{mCEP}}(s_1) \equiv CE\) for all \(\, s_1 \in \mathbb{S}\}\)

First, we need to obtain \(\widehat{CE}\):

fit <- glm(VCDpostM13 ~ vax, data = data, family = binomial)
overallRisk <- predict(fit, newdata = data.frame(vax = 0:1), type = "response")
overallRisk

##          1          2 
## 0.03523634 0.01299404

Test of \(\, \{H^1_0: \mathop{\mathrm{mCEP}}(s_1) \equiv CE\) for all \(\, s_1 \in \mathbb{S}\}\)

testConstancy(object = out4.riskCurve, 
              boot = out4.bootRiskCurve,
              contrast = "te",
              null = "H01",
              overallPlaRisk = overallRisk[1],
              overallTxRisk = overallRisk[2])

## [1] 0

Test of \(\, \{H^1_0: \mathop{\mathrm{mCEP}}(s_1) \equiv CE\) for all \(\, s_1 \in \mathbb{S}\}\)

testConstancy(object = out4.riskCurve, 
              boot = out4.bootRiskCurve,
              contrast = "rd",
              null = "H01",
              overallPlaRisk = overallRisk[1],
              overallTxRisk = overallRisk[2])

## [1] 0.162

Test of \(\, \{H^2_0: \mathop{\mathrm{mCEP}}(s_1) \equiv c\) for all \(\, s_1 \in \mathbb{S}_1 \subseteq \mathbb{S}\) and a known constant \(\, c \in \mathbb{R}\}\)

testConstancy(object = out4.riskCurve, 
              boot = out4.bootRiskCurve,
              contrast = "te",
              null = "H02",
              MCEPconstantH02 = 0,
              limS1 = c(0.6, with(out4.riskCurve, 
                        min(cpointP, cpointT))))

## [1] 0

testConstancy(object = out4.riskCurve, 
              boot = out4.bootRiskCurve,
              contrast = "te",
              null = "H02",
              MCEPconstantH02 = 0.3,
              limS1 = c(0.6, with(out4.riskCurve, 
                        min(cpointP, cpointT))))

## [1] 0.265

Test of \(\, \{H^2_0: \mathop{\mathrm{mCEP}}(s_1) \equiv c\) for all \(\, s_1 \in \mathbb{S}_1 \subseteq \mathbb{S}\) and a known constant \(\, c \in \mathbb{R}\}\)

testConstancy(object = out4.riskCurve, 
              boot = out4.bootRiskCurve,
              contrast = "rd",
              null = "H02",
              MCEPconstantH02 = 0,
              limS1 = c(0.6, with(out4.riskCurve, 
                        min(cpointP, cpointT))))

## [1] 0

testConstancy(object = out4.riskCurve, 
              boot = out4.bootRiskCurve,
              contrast = "rd",
              null = "H02",
              MCEPconstantH02 = 0.02,
              limS1 = c(0.6, with(out4.riskCurve, 
                        min(cpointP, cpointT))))

## [1] 0.498

Test of \(\, \{H^3_0: \mathop{\mathrm{mCEP}}_1(s_1) = \mathop{\mathrm{mCEP}}_2(s_1)\) for all \(\, s_1 \in \mathbb{S}_1 \subseteq \mathbb{S}\}\) for two different biomarkers or endpoints or both

testEquality(object1 = denv3.riskCurve,
             object2 = denv4.riskCurve,
             boot1 = denv3.bootRiskCurve,
             boot2 = denv4.bootRiskCurve,
             contrast = "te",
             null = "H03")

testEquality(object1 = denv3.riskCurve,
             object2 = denv4.riskCurve,
             boot1 = denv3.bootRiskCurve,
             boot2 = denv4.bootRiskCurve,
             contrast = "rd",
             null = "H03")

Test of \(\, \{H^4_0: \mathop{\mathrm{mCEP}}(s_1\mid X=1) =\) \(\, \mathop{\mathrm{mCEP}}(s_1 \mid X=0)\) for all \(\, s_1 \in \mathbb{S}_1 \subseteq \mathbb{S}\}\) for a baseline dichotomous phase 1 covariate \(\, X\)

testEquality(object1 = cyd14.riskCurve,
             object2 = cyd15.riskCurve,
             boot1 = cyd14.bootRiskCurve,
             boot2 = cyd15.bootRiskCurve,
             contrast = "te",
             null = "H04")

testEquality(object1 = cyd14.riskCurve,
             object2 = cyd15.riskCurve,
             boot1 = cyd14.bootRiskCurve,
             boot2 = cyd15.bootRiskCurve,
             contrast = "rd",
             null = "H04")

Parallel computing

Problem:

out4.bootRiskCurve <-
  bootRiskCurve(formula = VCDpostM13 ~ m13Titer + factor(age) + factor(country), 
                bsm = "bTiter", 
                tx = "vax", 
                data = data,
                hinge = TRUE,
                weights = "wts",
                psGrid = psGrid,
                iter = 500,
                seed = 9)

would take too long (i.e., weeks or months) to run

Parallel computing

Many parallel iterations using the R package rslurm
slurm_apply creates a folder of R files for independent and parallel execution on a cluster:

slurm_apply(bootRiskCurve,        # a function that executes a single iteration
            pars,                 # a data frame with input arguments for bootRiskCurve
            jobname = "bootOut",  # an output directory name
            nodes = 500)          # number of bootstrap iterations

Results from each completed iteration (with a different unique seed) written in a separate RDS file:
- A failed iteration (e.g., due to a cluster or code error) does not interrupt other iterations in progress
- Partial results available for examination
- Partial results allow to project the time to completion
get_slurm_out combines all RDS files into a single data frame

Summary of key R functions

riskCurve
bootRiskCurve
summary
plotMCEPcurve
testConstancy
testEquality

pssmooth, version 1.0.1, downloadable on CRAN:

https://cran.r-project.org/package=pssmooth

Code development:

https://github.com/mjuraska/pssmooth

Data: 9–16 year-olds at risk for VCD at month 13 in CYD14 + CYD15

Data: 9-16 year–olds at risk for VCD at month 13 in CYD14 + CYD15

R package pssmooth: inference about \(\, \mathop{\mathrm{mCEP}}(s_1)\) curves

Point estimation of \(\, P(Y(z)=1 \mid S(1)=s_1)\)

Point estimation of \(\, P(Y(z)=1 \mid S(1)=s_1)\)

Point estimation of \(\, P(Y(z)=1 \mid S(1)=s_1)\)

Point estimation of \(\, P(Y(z)=1 \mid S(1)=s_1)\)

Point estimation of \(\, P(Y(z)=1 \mid S(1)=s_1)\)

Bootstrap estimation of \(\, P(Y(z)=1 \mid S(1)=s_1)\)

Bootstrap estimation of \(\, P(Y(z)=1 \mid S(1)=s_1)\)

Summary of estimation of \(\, \mathop{\mathrm{mCEP}}(s_1)\)

Plotting of estimates of \(\, \mathop{\mathrm{mCEP}}(s_1)\)

Plotting of estimates of \(\, \mathop{\mathrm{mCEP}}(s_1)\)

Plotting of estimates of \(\, \mathop{\mathrm{mCEP}}(s_1)\)

Test of \(\, \{H^1_0: \mathop{\mathrm{mCEP}}(s_1) \equiv CE\) for all \(\, s_1 \in \mathbb{S}\}\)

Test of \(\, \{H^1_0: \mathop{\mathrm{mCEP}}(s_1) \equiv CE\) for all \(\, s_1 \in \mathbb{S}\}\)

Test of \(\, \{H^1_0: \mathop{\mathrm{mCEP}}(s_1) \equiv CE\) for all \(\, s_1 \in \mathbb{S}\}\)

Test of \(\, \{H^2_0: \mathop{\mathrm{mCEP}}(s_1) \equiv c\) for all \(\, s_1 \in \mathbb{S}_1 \subseteq \mathbb{S}\) and a known constant \(\, c \in \mathbb{R}\}\)

Test of \(\, \{H^2_0: \mathop{\mathrm{mCEP}}(s_1) \equiv c\) for all \(\, s_1 \in \mathbb{S}_1 \subseteq \mathbb{S}\) and a known constant \(\, c \in \mathbb{R}\}\)

Test of \(\, \{H^3_0: \mathop{\mathrm{mCEP}}_1(s_1) = \mathop{\mathrm{mCEP}}_2(s_1)\) for all \(\, s_1 \in \mathbb{S}_1 \subseteq \mathbb{S}\}\) for two different biomarkers or endpoints or both

Test of \(\, \{H^4_0: \mathop{\mathrm{mCEP}}(s_1\mid X=1) =\) \(\, \mathop{\mathrm{mCEP}}(s_1 \mid X=0)\) for all \(\, s_1 \in \mathbb{S}_1 \subseteq \mathbb{S}\}\) for a baseline dichotomous phase 1 covariate \(\, X\)

Parallel computing

Parallel computing

Summary of key R functions

R package `pssmooth`: inference about \(\, \mathop{\mathrm{mCEP}}(s_1)\) curves