R gbm categorical variable error

Date: 2017-04-09 17:07:29

Tags: r levels gbm

I am trying to build a model with gbm in R, but I am getting an error.

train=read.csv("~/Downloads/train-3.csv", stringsAsFactors=FALSE)
test=read.csv("~/Downloads/test-3.csv")
head(train)
randomSeed = 1738
set.seed(randomSeed)
library(gbm)

LogLossBinary = function(actual, predicted, eps = 1e-15) {  
  predicted = pmin(pmax(predicted, eps), 1-eps)  
  - (sum(actual * log(predicted) + (1 - actual) * log(1 - predicted))) / length(actual)
}
gbmModel = gbm(formula = Retweet_count ~ Text + Time,
           distribution = "bernoulli",
           data = train2,  # note: train2 is not defined in this snippet; only train and test are read in
           n.trees = 2500,
           shrinkage = .01,
           n.minobsinnode = 20)

This is the error:

Error in gbm.fit(x, y, offset = offset, distribution = distribution, w = w,  : 
gbm does not currently handle categorical variables with more than 1024 levels. Variable 1: 
Text has 2000 levels.

What are levels, and how can I work around this problem while still using gbm?
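For context on the term: in R, the "levels" of a factor are its distinct values, so a free-text column stored as a factor ends up with roughly one level per row. The sketch below uses only base R and a made-up data frame (`toy`, `Group`, `text_length`, and the helper `too_many_levels` are hypothetical names, not part of gbm or of the data above). It shows how to count levels, how to find the columns that exceed the 1024 limit quoted in the error message, and one possible workaround: replacing the text column with a numeric summary. Whether text length is a useful predictor for this data is an assumption, not something the question confirms.

```r
# Toy data standing in for the real data frame (hypothetical values).
set.seed(1)
toy <- data.frame(
  Text  = factor(paste("tweet number", 1:2000)),  # every row is a distinct string
  Group = factor(sample(c("a", "b", "c"), 2000, replace = TRUE))
)

# A factor stores each distinct value once as a "level".
nlevels(toy$Group)  # 3 levels: "a", "b", "c"
nlevels(toy$Text)   # 2000 levels, one per distinct string

# Hypothetical helper: names of factor columns with more levels than `limit`.
# The default of 1024 is taken from the gbm error message.
too_many_levels <- function(df, limit = 1024) {
  is_big <- vapply(
    df,
    function(col) is.factor(col) && nlevels(col) > limit,
    logical(1)
  )
  names(df)[is_big]
}

too_many_levels(toy)  # "Text"

# One possible workaround (an assumption about what suits the data):
# derive a numeric feature from the text, then drop the factor column.
toy$text_length <- nchar(as.character(toy$Text))
toy$Text <- NULL

too_many_levels(toy)  # character(0): no column exceeds the limit now
```

After a step like this, the model formula would refer to the derived numeric column instead of `Text`. The same check is worth running on any other column in the formula that might have been read in as a factor.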

0 answers:

No answers yet.