Data used (sample):
Empnumber Target Dailyrate DFrmHome Education Employeecount Age HourlyRate Jobinvolvment Joblevel
1 0 1 1 1 -1 1 1 1 1
2 1 1 1 1 1 1 1 1 1
3 0 1 -1 1 1 -1 1 1 1
4 1 1 1 1 1 1 1 1 1
5 0 1 -1 1 1 -1 1 1 1
6 1 1 -1 1 1 -1 1 -1 1
7 0 1 1 1 -1 1 1 1 1
8 1 1 1 1 1 1 1 1 1
9 1 1 -1 1 1 -1 1 1 1
10 1 1 1 1 1 1 1 1 1
11 0 1 -1 1 -1 -1 1 1 1
12 0 1 1 1 -1 1 1 1 1
13 0 1 1 1 1 1 1 1 1
14 1 1 1 1 1 -1 1 1 1
15 1 1 1 1 1 1 1 1 1
16 1 1 -1 1 1 -1 1 -1 1
17 0 1 1 1 1 -1 1 1 1
18 0 1 -1 1 1 -1 1 1 1
19 0 1 1 1 -1 1 1 1 1
20 0 1 -1 1 1 -1 1 1 1
21 0 1 1 1 -1 1 1 1 1
22 0 1 1 1 -1 1 1 1 1

R code:
library(caTools)         # sample.split
library(nnet)            # nnet
library(caret)           # confusionMatrix
library(NeuralNetTools)
# Read the data for train and test data
CTDF = read.csv("Project-Attrition.csv", header=T, na.strings=c(""))
# Let us set different seeds and extract 70% of the population to arrive
# at the Development, Test and Holdout samples
split = sample.split(CTDF$Target, SplitRatio=0.70)
head(split, 20)
CTDF.train = subset(CTDF, split == TRUE)
str(CTDF.train)
View(CTDF.train)
table(CTDF.train$Target)
CTDF.test = subset(CTDF, split == FALSE)
str(CTDF.test)
table(CTDF.test$Target)
CTDF.NND = CTDF.train
CTDF.NNT = CTDF.test
Model = nnet(Target~Dailyrate+DFrmHome+
Education+Employeecount+Age+HourlyRate+Jobinvolvment+Joblevel,
data=CTDF.NND, size=30, rang=0.1, decay=5e-4, maxit=500)
table(Actual=CTDF.NND$Target, Prediction=predict(Model, data=CTDF.NND))
CTDF.NNT$predict.class = predict(Model, CTDF.NNT)
confusionMatrix(CTDF.NNT$predict.class, CTDF.NNT$Target)
Output:
str(CTDF.train)
'data.frame': 3167 obs. of 10 variables:
$ Empnumber : int 1 2 4 5 6 7 11 12 14 15 ...
$ Target : int 0 1 1 0 1 0 0 0 1 1 ...
$ Dailyrate : int 1 1 1 1 1 1 1 1 1 1 ...
$ DFrmHome : int 1 1 1 -1 -1 1 -1 1 1 1 ...
$ Education : int 1 1 1 1 1 1 1 1 1 1 ...
$ Employeecount: int -1 1 1 1 1 -1 -1 -1 1 1 ...
$ Age : int 1 1 1 -1 -1 1 -1 1 -1 1 ...
$ HourlyRate : int 1 1 1 1 1 1 1 1 1 1 ...
$ Jobinvolvment: int 1 1 1 1 -1 1 1 1 1 1 ...
$ Joblevel : int 1 1 1 1 1 1 1 1 1 1 ...
For "table(Actual=CTDF.NND$Target, Prediction=predict(Model, data=CTDF.NND))"
I was expecting a confusion matrix like the one below.
0 1
0
1
But what I got is:
      Prediction
Actual 0.00892725347390036 0.0358546806229366 0.0376897872686173 0.10518235583921 0.124456456913317
     0                   3                242                  1               34               197
     1                   0                  9                  0                4                28
      Prediction
Actual 0.1363541434416 0.236290782608584 0.286744920923175 0.331511427682813 0.613818492677834 0.726882504157994
     0              19                42                 5               474                51                 6
     1               3                13                 2               235                81                16
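For the test data, when computing the confusion matrix, I get the following error:
confusionMatrix(CTDF.NNT$predict.class, CTDF.NNT$Target)
Error in confusionMatrix.default(CTDF.NNT$predict.class, CTDF.NNT$Target) : the data cannot have more levels than the reference
Please help me overcome these issues.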
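
For reference, here is a minimal sketch of one way a 2x2 table could be obtained, under some assumptions. Both symptoms above are consistent with Target being an integer column: nnet then fits a numeric output and predict returns one raw score per row, so table() gets one column per distinct score, and the predictions have more "levels" than the reference. The sketch assumes that converting the target to a factor and asking predict for type = "class" is acceptable for this problem. The data frame toy and the names train, test, fit, pred and cm are hypothetical stand-ins, not the real Project-Attrition.csv data.

```r
library(nnet)

set.seed(42)

# Hypothetical toy data standing in for Project-Attrition.csv (assumption):
# two -1/1 predictors and a 0/1 target stored as a factor.
n <- 200
toy <- data.frame(
  DFrmHome = sample(c(-1, 1), n, replace = TRUE),
  Age      = sample(c(-1, 1), n, replace = TRUE)
)
toy$Target <- factor(
  ifelse(toy$DFrmHome + toy$Age + rnorm(n, sd = 0.5) > 0, 1, 0),
  levels = c(0, 1)
)

train <- toy[1:140, ]
test  <- toy[141:200, ]

# With a factor response, nnet fits a classifier instead of a numeric output.
fit <- nnet(Target ~ DFrmHome + Age, data = train,
            size = 3, decay = 5e-4, maxit = 200, trace = FALSE)

# type = "class" returns class labels (as character) rather than raw scores;
# re-wrapping with the reference levels keeps both factors aligned.
pred <- factor(predict(fit, newdata = test, type = "class"),
               levels = levels(test$Target))

# Both arguments now have exactly the levels "0" and "1", so this is 2x2.
cm <- table(Actual = test$Target, Prediction = pred)
print(cm)
```

With caret loaded, confusionMatrix(pred, test$Target) should then accept the two factors, since they share the same levels. If the target has to stay numeric, thresholding the raw predictions (for example at 0.5) before building the table would be an alternative.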