Multivariate k-means clustering

Time: 2014-12-10 17:21:00

Tags: r k-means

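I am trying to make a multivariate k-means cluster plot in R. I have 3 variables, 10 columns of data, plus the context (like Species in Iris), so 11 variables in total. My x is PeruReady, obviously.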

Following an online tutorial, this is what I have so far:

PeruReady.km <- kmeans(PeruReady[, -1], 3, iter.max=1000) 
tbl <- table(PeruReady[, 1], PeruReady.km$cluster) 
PeruReady.dist <- dist(PeruReady[, -1]) 
PeruReady.mds <- cmdscale(PeruReady.dist) 
c.chars <- c("*", "o", "+")[as.integer(PeruReady$Context)] 
a.cols <- rainbow(3)[PeruReady$cluster] 
plot(PeruReady.mds, col=a.cols, pch=c.chars, xlab="X", ylab="Y")

But my plot is completely empty. What am I doing wrong?

1 answer:

Answer 0: (score: 0)

With a small dataset (demand.sm), your code works fine. Have you normalized all the numeric columns?
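In case it helps, here is a minimal sketch of what normalizing could look like. It uses the built-in iris data as a stand-in for PeruReady, which is an assumption since the real data is not shown. scale() centers each numeric column and divides it by its standard deviation, so no single column dominates the Euclidean distances that kmeans() and dist() rely on.

```r
# Stand-in for PeruReady: label/context column first, numeric columns after it
dat <- iris[, c(5, 1:4)]

# Standardize every numeric column (mean 0, standard deviation 1)
num <- scale(dat[, -1])

# Quick sanity checks on the result
col.means <- colMeans(num)        # should all be (numerically) zero
col.sds   <- apply(num, 2, sd)    # should all be one
print(round(col.means, 10))
print(col.sds)
```

The scaled matrix num can then be passed to kmeans() and dist() in place of the raw columns.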

# demand.sm, as printed by dput(demand.sm) and assigned back so the example can be re-run:
demand.sm <- structure(list(Demand = c("rify la", "p quasi", "rify LD", "ventive", 
"ekeeper", " de min", " risk g", " approv", "uest te", "", "al trai", 
"cation", "ely inv", "rge tim", "get of ", "vey pro", "ent ONA", 
"ble sel", "cipline", "tus rep", "ced-ran"), normalized = structure(c(-1.15780226157481, 
-0.319393727330983, -1.15780226157481, -1.15780226157481, -0.319393727330983, 
-0.319393727330983, -0.319393727330983, -0.319393727330983, 0.519014806912847, 
0.519014806912847, 0.519014806912847, -0.738597994452898, -0.738597994452898, 
2.19583187540051, 2.19583187540051, -1.15780226157481, -0.319393727330983, 
-0.319393727330983, 0.519014806912847, 1.35742334115668, 0.519014806912847
), .Dim = c(21L, 1L), "scaled:center" = 3.76190476190476, "scaled:scale" = 2.38547190100328)), .Names = c("Demand", 
"normalized"), row.names = c(NA, -21L), class = "data.frame")
clusters <- kmeans(demand.sm[ , "normalized"], 5)

demand.dist <- dist(demand.sm[ , "normalized"]) 
demand.mds <- cmdscale(demand.dist) # multidimensional scaling of data matrix, aka principal coordinates analysis
# demand.sm has no Context column, so every point gets the same symbol
c.chars <- "o"
# one colour per cluster; kmeans() was asked for 5 centers above
a.cols <- rainbow(5)[clusters$cluster] 
plot(demand.mds, col=a.cols, pch=c.chars, xlab="X", ylab="Y")
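Two things are worth checking in the code from the question. First, a.cols is built from PeruReady$cluster, but the cluster assignments live in the kmeans result, PeruReady.km$cluster; PeruReady$cluster is NULL unless the data frame happens to have such a column, which leaves a.cols empty. Second, as.integer() on a character (non-factor) Context column gives NA, and points with pch = NA are not drawn. Either could plausibly give an empty plot, though I cannot confirm that without the data. Below is a sketch of the same steps on iris, used here as a stand-in; the column layout and the Context name are assumptions taken from the question.

```r
set.seed(42)                               # kmeans picks random starting centers

dat <- iris[, c(5, 1:4)]                   # label column first, like PeruReady
names(dat)[1] <- "Context"                 # hypothetical name, matching the question

num <- scale(dat[, -1])                    # normalized numeric columns
km  <- kmeans(num, centers = 3, iter.max = 1000)
tbl <- table(dat$Context, km$cluster)      # context vs. cluster cross-tabulation

mds <- cmdscale(dist(num))                 # 2-D coordinates for plotting

# as.factor() first, so a character Context column still maps to 1, 2, 3
c.chars <- c("*", "o", "+")[as.integer(as.factor(dat$Context))]
# colours come from the kmeans result, not from the data frame
a.cols  <- rainbow(3)[km$cluster]

plot(mds, col = a.cols, pch = c.chars, xlab = "X", ylab = "Y")
```

If length(a.cols) or length(c.chars) is 0, or either vector is all NA, that would point to the lookup rather than to kmeans() itself.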
