I am trying to build a model with gbm in R, but I am getting an error.
train=read.csv("~/Downloads/train-3.csv", stringsAsFactors=FALSE)
test=read.csv("~/Downloads/test-3.csv")
head(train)
randomSeed = 1738
set.seed(randomSeed)
library(gbm)
LogLossBinary = function(actual, predicted, eps = 1e-15) {
  # Clamp predictions away from 0 and 1 so log() stays finite
  predicted = pmin(pmax(predicted, eps), 1-eps)
  - (sum(actual * log(predicted) + (1 - actual) * log(1 - predicted))) / length(actual)
}
# NOTE: train2 is not defined in the code above (only train and test are read in)
gbmModel = gbm(formula = Retweet_count ~ Text + Time,
               distribution = "bernoulli",
               data = train2,
               n.trees = 2500,
               shrinkage = .01,
               n.minobsinnode = 20)
This is the error:
Error in gbm.fit(x, y, offset = offset, distribution = distribution, w = w, :
gbm does not currently handle categorical variables with more than 1024 levels. Variable 1:
Text has 2000 levels.
What are levels, and how can I work around this problem while still using gbm?
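For reference, here is a minimal sketch of how the number of levels per column could be checked. It uses a made-up toy data frame (`df` and its values are hypothetical, not taken from train-3.csv), and it assumes the error is counting the distinct categories of a factor column, which is what `nlevels()` reports in base R:

```r
# Hypothetical toy data frame standing in for the real training data
df <- data.frame(
  Text = c("a", "b", "c", "a"),
  Time = c("morning", "evening", "morning", "morning"),
  stringsAsFactors = TRUE  # store character columns as factors
)

# Count the levels (distinct categories) of each factor column
level_counts <- sapply(df, nlevels)
print(level_counts)
```

Running the same `sapply(..., nlevels)` call on the data frame passed to gbm should show which columns exceed the 1024-level limit named in the error.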