The Bootstrap

Eric Zivot

Monday, May 11, 2015

Set options and load packages

options(digits=3, width=70)
library(boot)  
library(IntroCompFinR)
## Loading required package: xts
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## 
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
library(PerformanceAnalytics)
## 
## Attaching package: 'PerformanceAnalytics'
## 
## The following object is masked from 'package:graphics':
## 
##     legend

R function sample()

Consider random sampling from the vector r=c(0.1,0.05,-0.02,0.03,-0.04).

r = c(0.1,0.05,-0.02,0.03,-0.04)
set.seed(123)
idx = sample(5, replace=TRUE)
idx
## [1] 2 4 3 5 5
r[idx]
## [1]  0.05  0.03 -0.02 -0.04 -0.04

R function sample()

set.seed(123)
sample(r, replace=TRUE)
## [1]  0.05  0.03 -0.02 -0.04 -0.04

Example Data: monthly cc returns on MSFT

my.panel <- function(...) {
  lines(...)
  abline(h=0)
}
plot.zoo(msftRetC, col="blue", lwd=2, main="", panel=my.panel, type="l")

Put xts data in matrix for use with R package boot

returns.mat = as.matrix(cerRetC)
MSFT = returns.mat[,"MSFT", drop=FALSE]

Estimate CER model parameters for MSFT

Compute CER model estimates for MSFT

n.obs = nrow(MSFT)
muhat.MSFT = mean(MSFT)
sigmahat.MSFT = sd(MSFT)
estimate = c(muhat.MSFT, sigmahat.MSFT)

Calculate analytic standard errors

se.muhat.MSFT = sigmahat.MSFT/sqrt(n.obs)
se.sigmahat.MSFT = sigmahat.MSFT/sqrt(2*n.obs)
se = c(se.muhat.MSFT, se.sigmahat.MSFT)

Show estimates with analytic standard errors

ans = rbind(estimate, se)
colnames(ans) = c("Mu", "Sigma")
ans
##               Mu  Sigma
## estimate 0.00413 0.1002
## se       0.00764 0.0054

Brute force bootstrap

Same idea as Monte Carlo Simulation but instead of generating random data from an assumed distribution or model, you generate random data by sampling with replacement from the observed data using sample().

Example: Bootstrap \(\hat{\mu}_{MSFT}\)

# Create bootstrap simulation for muhat
B = 999
muhat.boot = rep(0, B)
n.obs = nrow(MSFT)
set.seed(123)
for (i in 1:B) {
  boot.data = sample(MSFT, n.obs, replace=TRUE)
  muhat.boot[i] = mean(boot.data)
}

Bootstrap bias estimate

Calculate bootstrap bias estimate

mean(muhat.boot) - muhat.MSFT
## [1] 0.000199

Bootstrap bias is close to zero

Bootstrap SE

Calculate bootstrap SE and compare with analytic SE

sd(muhat.boot)
## [1] 0.00746
# analytic SE
se.muhat.MSFT
## [1] 0.00764

bootstrap SE is very close to analytic SE.

Bootstrap distribution

# plot bootstrap distribution and show normal qq-plot
par(mfrow=c(1,2))
  hist(muhat.boot, col="cornflowerblue")
  abline(v=muhat.MSFT, col="white", lwd=2)
  qqnorm(muhat.boot, col="cornflowerblue")
  qqline(muhat.boot)

par(mfrow=c(1,1))

Bootstrap distribution looks like normal distribution.

Bootstrap 95% confidence interval

Calculate bootstrap 95% confidence interval based on normal distribution

se.boot = sd(muhat.boot)
lower = muhat.MSFT - 2*se.boot
upper = muhat.MSFT + 2*se.boot
width = upper - lower
c(lower, upper, width)
## [1] -0.0108  0.0190  0.0298

R package boot

Using the boot() function

The function boot() requires user-supplied functions that take two arguments: data and an index. The index is created by the boot() function and represents random resampling with replacement

# function for bootstrapping sample mean
mean.boot = function(x, idx) {
# arguments:
# x      data to be resampled
# idx    vector of scrambled indices created by boot() function
# value:
# ans        mean value computed using resampled data
     ans = mean(x[idx])
     ans
}

Using the boot() function

# bootstrapping muhat
MSFT.muhat.boot = boot(MSFT, statistic = mean.boot, R=999)
class(MSFT.muhat.boot)
## [1] "boot"
names(MSFT.muhat.boot)
##  [1] "t0"        "t"         "R"         "data"      "seed"     
##  [6] "statistic" "sim"       "call"      "stype"     "strata"   
## [11] "weights"
MSFT.muhat.boot
## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = MSFT, statistic = mean.boot, R = 999)
## 
## 
## Bootstrap Statistics :
##     original   bias    std. error
## t1*  0.00413 5.71e-05     0.00777

Here, “original” is \(\hat{\mu}\); “bias” is bootstrap bias; “std. error” is the bootstrap SE. Recall, the analytic SE is 0.008.

Plot method for “boot” objects

Plot method shows histogram of bootstrap values and normal qq-plot.

plot(MSFT.muhat.boot)

Bootstrap confidence intervals

Use the function boot.ci() to compute bootstrap confidence intervals

boot.ci(MSFT.muhat.boot, conf = 0.95, type = c("norm","perc"))
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 999 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = MSFT.muhat.boot, conf = 0.95, type = c("norm", 
##     "perc"))
## 
## Intervals : 
## Level      Normal             Percentile     
## 95%   (-0.0112,  0.0193 )   (-0.0111,  0.0193 )  
## Calculations and Intervals on Original Scale

Bootstrapping \(\hat{\sigma}\) for MSFT

# function for bootstrapping sample standard deviation
sd.boot = function(x, idx) {
# arguments:
# x     data to be resampled
# idx       vector of scrambled indices created by boot() function
# value:
# ans       sd value computed using resampled data
     ans = sd(x[idx])
     ans
}
# bootstrap sigmahat
MSFT.sigmahat.boot = boot(MSFT, statistic = sd.boot, R=999)
MSFT.sigmahat.boot
## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = MSFT, statistic = sd.boot, R = 999)
## 
## 
## Bootstrap Statistics :
##     original    bias    std. error
## t1*      0.1 -0.000526     0.00794

Recall, the analytic SE for sigmahat is 0.005. Here the bootstrap SE should be more accurate because the analytic SE is only an approximation based on the central limit theorem.

Bootstrap distribution

# plot bootstrap distribution
plot(MSFT.sigmahat.boot)

Bootstrap distribution looks pretty normal

Bootstrap 95% confidence intervals

# compute 95% confidence intervals
boot.ci(MSFT.sigmahat.boot, conf=0.95, type=c("norm", "perc"))
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 999 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = MSFT.sigmahat.boot, conf = 0.95, type = c("norm", 
##     "perc"))
## 
## Intervals : 
## Level      Normal             Percentile     
## 95%   ( 0.0852,  0.1163 )   ( 0.0843,  0.1167 )  
## Calculations and Intervals on Original Scale

Bootstrapping normal VaR for MSFT

Create function to compute normal Value-at-Risk to be passed to boot()

ValueAtRisk.boot = function(x, idx, p=0.05, w=100000) {
# x     data to be resampled
# idx       vector of scrambled indices created by boot() function
# p       probability value for VaR calculation
# w       value of initial investment
# value:
# ans       Value-at-Risk computed using resampled data

    q = mean(x[idx]) + sd(x[idx])*qnorm(p)
    VaR = (exp(q) - 1)*w
    VaR
}

Bootstrapping normal VaR

MSFT.VaR.boot = boot(MSFT, statistic = ValueAtRisk.boot, R=999)
MSFT.VaR.boot
## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = MSFT, statistic = ValueAtRisk.boot, R = 999)
## 
## 
## Bootstrap Statistics :
##     original  bias    std. error
## t1*   -14846  -0.534        1299

Bootstrapping normal VaR

plot(MSFT.VaR.boot)

boot.ci(MSFT.VaR.boot, conf=0.95, type=c("norm", "perc"))
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 999 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = MSFT.VaR.boot, conf = 0.95, type = c("norm", 
##     "perc"))
## 
## Intervals : 
## Level      Normal             Percentile     
## 95%   (-17392, -12299 )   (-17396, -12274 )  
## Calculations and Intervals on Original Scale

95% confidence interval for 5% VaR is fairly wide.