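How can I get the OOB samples used for each tree in a random forest model in R?

Asked: 2017-12-09 13:02:39

Tags: r random-forest

Is it possible to get the OOB samples that the random forest algorithm uses for each tree? I am using R. I know that the random forest algorithm uses roughly 66% of the data (selected at random) to grow each tree and the remaining 34% as OOB samples to measure the OOB error, but I don't know how to get those OOB samples for each tree.

Any ideas?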

1 answer:

Answer 0 (score: 1)

Assuming you are using the randomForest package, you just need to set the keep.inbag argument to TRUE.

library(randomForest)
set.seed(1)
rf <- randomForest(Species ~ ., iris, keep.inbag = TRUE)

The output list will contain an n by ntree matrix that can be accessed by the name inbag.

dim(rf$inbag)
# [1] 150 500
rf$inbag[1:5, 1:3]
#   [,1] [,2] [,3]
# 1    0    1    0
# 2    1    1    0
# 3    1    0    1
# 4    1    0    1
# 5    0    0    2

The values in the matrix tell you how many times a sample was in-bag. For example, the value of 2 in row 5 column 3 above says that the 5th observation was included in-bag twice for the 3rd tree. A value of 0 means the observation was out-of-bag for that tree, so the OOB samples of a tree are the rows with a 0 in that tree's column.

As a bit of background here, a sample can show up in-bag more than once (hence the 2) because by default the sampling is done with replacement.

You can also sample without replacement via the replace parameter.
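To make the difference between the two sampling modes concrete, here is a minimal base-R sketch that mimics the draw for a single tree. It does not call randomForest and is not the package's internal code. The sample sizes assume the documented defaults for sampsize (n with replacement, ceiling(0.632 * n) without), and the variable names are illustrative.

```r
set.seed(1)
n <- 150  # number of observations, as in iris

# With replacement: n draws, so some rows repeat and others are never drawn.
draw_with   <- sample(n, size = n, replace = TRUE)
counts_with <- tabulate(draw_with, nbins = n)  # in-bag count per row

# Without replacement: a subsample whose size is assumed to follow the
# documented default, ceiling(0.632 * n).
draw_without   <- sample(n, size = ceiling(0.632 * n), replace = FALSE)
counts_without <- tabulate(draw_without, nbins = n)

max(counts_with)            # largest in-bag count, with replacement
max(counts_without)         # largest in-bag count, without replacement
mean(counts_with == 0)      # fraction of rows left out-of-bag
mean(counts_without == 0)
```

Rows with a count of 0 play the role of the OOB samples for that simulated tree.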

Fitting the forest with replace = FALSE, we can then verify that no sample is included in-bag more than once for any tree.

set.seed(1)
rf2 <- randomForest(Species ~ ., iris, keep.inbag = TRUE, replace = FALSE)
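As a sketch of how the check and the actual OOB lookup might be written, the snippet below refits both forests so that it runs on its own. It then compares the largest in-bag counts and pulls out the OOB rows per tree. The helper oob_rows() is a hypothetical name introduced here for illustration and is not part of the randomForest package.

```r
library(randomForest)

set.seed(1)
rf  <- randomForest(Species ~ ., iris, keep.inbag = TRUE)
set.seed(1)
rf2 <- randomForest(Species ~ ., iris, keep.inbag = TRUE, replace = FALSE)

# Largest in-bag count over all observations and trees.
max(rf$inbag)   # can exceed 1 when sampling with replacement
max(rf2$inbag)  # at most 1 when sampling without replacement

# OOB observations for a tree are the rows whose in-bag count is 0.
# oob_rows() is a hypothetical helper, not a randomForest function.
oob_rows <- function(fit, tree) unname(which(fit$inbag[, tree] == 0))

# OOB rows of iris for the 3rd tree of the first forest.
oob_tree3 <- oob_rows(rf, 3)
head(iris[oob_tree3, ])

# OOB row indices for every tree, as a list with one element per tree.
oob_all <- lapply(seq_len(ncol(rf$inbag)), function(t) oob_rows(rf, t))
```

Each element of oob_all holds row indices into the training data, so iris[oob_all[[t]], ] gives the OOB samples for tree t.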