Question

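I want to compute BCa confidence intervals for a multi-stage bootstrap using boot.ci(). Here is an example taken from "Non-parametric bootstrapping on the highest level of clustered data using boot() function from {boot} in R", which uses the boot command.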

library(boot)

# creating example df
rho <- 0.4
dat <- expand.grid(
  trial=factor(1:5),
  subject=factor(1:3)
)
sig <- rho * tcrossprod(model.matrix(~ 0 + subject, dat))
diag(sig) <- 1
set.seed(17); dat$value <- chol(sig) %*% rnorm(15, 0, 1)

# function for resampling
resamp.mean <- function(dat, 
                    indices, 
                    cluster = c('subject', 'trial'), 
                    replace = TRUE){
  cls <- sample(unique(dat[[cluster[1]]]), replace=replace)
  sub <- lapply(cls, function(b) subset(dat, dat[[cluster[1]]]==b))
  sub <- do.call(rbind, sub)
  mean(sub$value)
} 

dat.boot <- boot(dat, resamp.mean, 4) # produces an estimated statistic

boot.ci(dat.boot) # produces errors

How can I use boot.ci on the boot output?

Answer 1

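You are using too few bootstrap resamples. When you call boot.ci, it needs influence measures; if none are supplied, they are obtained from empinf, which may fail when there are too few observations. For a similar explanation, see here.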

Try

dat.boot <- boot(dat, resamp.mean, 1000) 
boot.ci(dat.boot, type = "bca")

which gives:

> boot.ci(dat.boot, type = "bca") 

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates

CALL : 
boot.ci(boot.out = dat.boot, type = "bca")

Intervals : 
Level       BCa          
95%   (-0.2894,  1.2979 )  
Calculations and Intervals on Original Scale
Some BCa intervals may be unstable

As an alternative, you can supply L (the influence measures) yourself.

# proof of concept, use appropriate value for L!
> dat.boot <- boot(dat, resamp.mean, 4)
> boot.ci(dat.boot, type = "bca", L = 0.2)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 4 bootstrap replicates

CALL : 
boot.ci(boot.out = dat.boot, type = "bca", L = 0.2)

Intervals : 
Level       BCa          
95%   ( 0.1322,  1.2979 )  
Calculations and Intervals on Original Scale
Warning : BCa Intervals used Extreme Quantiles
Some BCa intervals may be unstable
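
The value L = 0.2 above is only a placeholder. One possible way to get less arbitrary influence values, offered here as a sketch and not as the method boot.ci uses internally, is a leave-one-cluster-out jackknife over subjects. The names t_full, t_loo, L_jack and a_hat are made up for this illustration, and the choice of the subject as the jackknife unit is an assumption that matches the resampling in resamp.mean.

```r
# Rebuild the example data from the question (as.numeric() drops the matrix shape)
rho <- 0.4
dat <- expand.grid(trial = factor(1:5), subject = factor(1:3))
sig <- rho * tcrossprod(model.matrix(~ 0 + subject, dat))
diag(sig) <- 1
set.seed(17)
dat$value <- as.numeric(chol(sig) %*% rnorm(15, 0, 1))

# Statistic on the full data
t_full <- mean(dat$value)

# Statistic recomputed with each subject (cluster) left out in turn
subjects <- as.character(unique(dat$subject))
t_loo <- sapply(subjects, function(s) mean(dat$value[dat$subject != s]))

# Jackknife influence values, one per cluster: (n - 1) * (t_full - t_loo)
n_cl <- length(subjects)
L_jack <- (n_cl - 1) * (t_full - t_loo)

# Standard BCa acceleration constant computed from the influence values
a_hat <- sum(L_jack^3) / (6 * sum(L_jack^2)^1.5)
```

The resulting vector could then be passed through the L argument, for example boot.ci(dat.boot, type = "bca", L = L_jack). With only three clusters the acceleration estimate rests on three values, so the interval should still be treated with caution.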
