Error: 'train' and 'class' have different lengths

Time: 2019-02-23 00:58:30

Tags: r dataframe vector knn predict

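I have been trying to use the knn function to start making predictions, but when I run my code it throws this error:

Error in knn(data.frame(tr5_train), data.frame(tr5_test), cl = pred_train_labels, : 'train' and 'class' have different lengths

I checked that all of the datasets are data frames, and I also tried using the labels as a vector, but without success.

Here is the code I used: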

test_tr5_no_target<- test_tr5[-2]


tr5_train<- test_tr5_no_target[1:74475, , drop = FALSE]

tr5_test<- test_tr5_no_target[74476:93094, , drop = FALSE]

pred_train_labels<- test_tr5[1:74475, 2] 

pred_test_labels<- test_tr5[74476:93094, 2]


#install.packages("class")

library(class)

##ensure all data is a dataframe

as.data.frame(tr5_train)

as.data.frame(tr5_test)

as.data.frame(pred_train_labels)


pred1<- knn(data.frame(tr5_train), data.frame(tr5_test), cl = pred_train_labels, k = 5)

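Keep in mind that the label column (column 2) is a numeric target feature. I have researched this extensively but could not find what causes the error. Is there something I am doing wrong?

Any help is greatly appreciated! (Unfortunately, I cannot share the data itself because it is restricted.)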

-Jose C.

1 answer:

Answer 0 (score: 1)

To answer your question directly: you want your labels (here label) to be a vector, not a data frame. We can use the mtcars dataset to recreate your error.

library('tidyverse')
library('class')

set.seed(1)

x <- mtcars
target <- x[-1]

size <- floor(0.75 * nrow(x))
train_ind <- sample(seq_len(nrow(x)), size = size)

train <- x[train_ind, ]
test <- x[-train_ind, ]

label <- as.data.frame(x[1][train_ind, ]) #problem is here

test <- knn(train,test,cl = label, k = 5)
test

Error in knn(train, test, cl = label, k = 5) : 
  'train' and 'class' have different lengths

By allowing label to be a vector, and then calling attributes on the new knn object, we can get the output:

train_ind <- sample(seq_len(nrow(x)), size = size)

train <- x[train_ind, ]
test <- x[-train_ind, ]

label <- x[1][train_ind, ] #NOT a dataframe

test <- knn(train,test,cl = label, k = 5, prob = TRUE)
attributes(test)

$`levels`
 [1] "10.4" "14.3" "14.7" "15"   "15.2" "15.8" "16.4" "17.3"
 [9] "17.8" "18.7" "19.2" "19.7" "21"   "21.4" "22.8" "24.4"
[17] "26"   "30.4" "32.4"

Going through the examples in ??knn also shows this.
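The reason a data frame label fails can be sketched as follows. knn compares length(cl) with the number of training rows, and length() of a data frame is its column count, so a one-column data frame has length 1. This may also be what happens in the question if test_tr5 is a tibble (an assumption, since the question does not say), because test_tr5[1:74475, 2] on a tibble stays a one-column tibble instead of dropping to a vector. The names label_df, label_vec and pred below are illustrative and do not come from the original post; this is a minimal sketch on mtcars, not the asker's data.

```r
library(class)  # provides knn()

set.seed(1)
x <- mtcars
train_ind <- sample(seq_len(nrow(x)), size = floor(0.75 * nrow(x)))

# Features only: drop the target column (mpg, column 1)
train <- x[train_ind, -1]
test  <- x[-train_ind, -1]

# A one-column data frame has length 1 (its column count), not nrow()
label_df <- x[train_ind, 1, drop = FALSE]
length(label_df)   # 1
nrow(train)        # 24

# Extracting the column with [[ ]] gives a plain vector,
# for base data frames and tibbles alike
label_vec <- x[[1]][train_ind]
length(label_vec)  # 24

# Guard before calling knn(): one label per training row, as an atomic vector
stopifnot(is.atomic(label_vec), length(label_vec) == nrow(train))

pred <- knn(train, test, cl = label_vec, k = 5)
```

For the code in the question, the equivalent change would be to build the labels with test_tr5[[2]][1:74475], which yields a vector whether test_tr5 is a base data frame or a tibble. Note also that the bare as.data.frame(...) calls in the question only print a converted copy; they do not modify the objects unless the result is assigned back.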