Using naive Bayes in R with e1071

Time: 2015-09-24 03:05:44

Tags: r statistics modeling naivebayes

Is this code correct?
library(e1701) ## Categorical data only:
data(HouseVotes84, package = "mlbench")
model <-
naiveBayes(Class ~ ., data = HouseVotes84)
a<-c("n","y","n","y","n","n","y","y","n","n")
names(a)<-c("V1","V2","V3","V4","V5","V6","V7","V8","V9","V10")
pred<-predict(model,a)
tab<-table(pred,a)
sum(tab[row(tab)==col(tab)])/sum(tab)

I want to use the model to make a prediction from a voting record.

1 answer:

Answer 0 (score: 1)

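It is hard to know exactly what you intend, but it seems you want to predict a legislator's party (Class) from his or her votes (V1:V10). If so, this is what you want: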

library(e1071) 
data(HouseVotes84, package = "mlbench")
model <- naiveBayes(Class ~ ., data = HouseVotes84)
a <- data.frame(matrix(c("n","y","n","y","n","n","y","y","n","n"), nrow = 1))
names(a) <- c("V1","V2","V3","V4","V5","V6","V7","V8","V9","V10")
(pred <- predict(model, a))
# [1] democrat
# Levels: democrat republican
(pred <- predict(model, a, type = "raw"))
#       democrat republican
# [1,] 0.9277703  0.0722297

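The code you provided has two errors. First, you did not correctly load the package that contains naiveBayes(), since its name is actually e1071. Second, you did not supply the right object to the newdata argument of predict(). It requires a data.frame, whereas you are giving it a vector, which is treated here as 10 observations, each supplying a single feature: the first V1, the second V2, and so on. naiveBayes() does not care if you supply an incomplete list of features, so it still works: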

> pred
 [1] democrat democrat democrat democrat democrat democrat democrat democrat democrat democrat
Levels: democrat republican
> (pred <- predict(model,a, type = "raw"))
       democrat republican
 [1,] 0.6137931  0.3862069
 [2,] 0.6137931  0.3862069
 [3,] 0.6137931  0.3862069
 [4,] 0.6137931  0.3862069
 [5,] 0.6137931  0.3862069
 [6,] 0.6137931  0.3862069
 [7,] 0.6137931  0.3862069
 [8,] 0.6137931  0.3862069
 [9,] 0.6137931  0.3862069
[10,] 0.6137931  0.3862069

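But here you get ten uninformative predictions, because you have only one feature with which to make each prediction. That is why the predictions match the prior probabilities: you have almost no data to update them with: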

# prior class probabilities (with no model)
> prop.table(table(HouseVotes84$Class))

  democrat republican 
 0.6137931  0.3862069 
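
The difference between the two newdata objects can be checked in base R alone. The sketch below is illustrative rather than part of the original answer; the names a_vec and a_df are made up here, and it assumes predict() coerces newdata with as.data.frame() and looks up features by column name, which is an assumption about e1071's internals.

```r
# The asker's object: a named character vector of length 10
a_vec <- c("n","y","n","y","n","n","y","y","n","n")
names(a_vec) <- paste0("V", 1:10)

# The corrected object: one row, ten columns named V1..V10
a_df <- data.frame(matrix(a_vec, nrow = 1))
names(a_df) <- paste0("V", 1:10)

# Coercing the vector gives ten rows and a single column
vec_as_df <- as.data.frame(a_vec)
dim(vec_as_df)
dim(a_df)

# Does the coerced vector have any column named V1..V10?
any(paste0("V", 1:10) %in% names(vec_as_df))
```

If the features are matched by column name, the coerced vector leaves little or nothing to update the priors with, which would be consistent with every row of the raw output above equalling the prior class probabilities.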

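In the corrected code above, which uses more features (the data on 10 votes) to predict Class for this new, single observation, we get a much more confident prediction that this legislator is a Democrat, because the posterior probabilities draw on more data to update the prior class probabilities of 0.61 and 0.39.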
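
The updating of the priors can be sketched by hand. This is a minimal illustration in base R, not the fitted model: the priors come from the prop.table() output above, but the conditional probabilities in lik are hypothetical numbers chosen only to show the mechanics.

```r
# Prior class probabilities, taken from the prop.table() output above
prior <- c(democrat = 0.6137931, republican = 0.3862069)

# HYPOTHETICAL values of P(observed vote | class) for three votes;
# illustrative only, not taken from the fitted naiveBayes model
lik <- rbind(
  vote1 = c(democrat = 0.60, republican = 0.20),
  vote2 = c(democrat = 0.50, republican = 0.50),
  vote3 = c(democrat = 0.90, republican = 0.15)
)

# Naive Bayes update: multiply by each conditional probability in turn
# and renormalise so the two class probabilities sum to 1
posterior <- prior
for (i in seq_len(nrow(lik))) {
  posterior <- posterior * lik[i, ]
  posterior <- posterior / sum(posterior)
  print(round(posterior, 3))
}
```

With no votes, the posterior is simply the prior. Each informative vote moves it further, while a vote that is equally likely under both classes (vote2 here) leaves it unchanged.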