Question

I am trying to create a plot similar to the one in this image:

[Image: three clusters, with the data points drawn as circles according to their distance from the centroids]

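There are three clusters, and every data point (circle) is plotted according to its Euclidean distance from the centroids. With this image it is easy to see that 5 samples from class 2 ended up in the wrong clusters.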

I am running k-means with the kmeans function and cannot figure out how to draw this kind of plot.

For the sake of example, we can use the iris dataset.

> iri <- iris
> cl <- kmeans(iri[, 1:4], 3)
> cl
K-means clustering with 3 clusters of sizes 38, 62, 50

Cluster means:
  Sepal.Length Sepal.Width Petal.Length Petal.Width
1     6.850000    3.073684     5.742105    2.071053
2     5.901613    2.748387     4.393548    1.433871
3     5.006000    3.428000     1.462000    0.246000

Clustering vector:
  [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 [40] 3 3 3 3 3 3 3 3 3 3 3 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1
 [79] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 1 1 1 1 2 1 1 1 1 1 1 2 2 1 1
[118] 1 1 2 1 2 1 2 1 1 2 2 1 1 1 1 1 2 1 1 1 1 2 1 1 1 2 1 1 1 2 1 1 2

Within cluster sum of squares by cluster:
[1] 23.87947 39.82097 15.15100
 (between_SS / total_SS =  88.4 %)

Available components:

[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
[6] "betweenss"    "size"         "iter"         "ifault"
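
To see how the clusters line up with the known species, one option is to cross-tabulate the species labels against `cl$cluster`. This is only a sketch: the seed and `nstart = 25` are my own assumptions (they are not in the call above), added so that repeated runs give the same partition. Cluster numbers are arbitrary and may differ from the output shown above.

```r
# Sketch: compare k-means cluster assignments with the known iris species.
set.seed(42)  # assumption: fixed seed so the result is reproducible
cl <- kmeans(iris[, 1:4], centers = 3, nstart = 25)  # nstart is an assumption

# Rows are the true species, columns are the (arbitrarily numbered) clusters.
# A species spread over more than one column has samples in a "wrong" cluster.
tab <- table(Species = iris$Species, Cluster = cl$cluster)
print(tab)
```

Off-diagonal counts (after matching each cluster to its dominant species) correspond to the misplaced samples that the image makes visible.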

Image source: https://github.com/michaelwsherman/winecluster. The author does not appear to have used kmeans.

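I suspect this may not be possible with kmeans, because it does not return the distances to the centroids. Is there another way to display the data in this or a similar fashion?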
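
For what it is worth, although kmeans does not return the distances, they can be derived from the `centers` component listed above. The following is a minimal sketch under my own assumptions (fixed seed, `nstart = 25`, and a simple base-graphics layout); it is one possible rendering and not necessarily how the linked image was produced.

```r
# Sketch: Euclidean distance from every observation to every k-means centroid,
# followed by a simple plot of each point's distance to its own centroid.
set.seed(42)  # assumption: fixed seed so the run is reproducible
x  <- as.matrix(iris[, 1:4])
cl <- kmeans(x, centers = 3, nstart = 25)  # nstart is an assumption

# n x k matrix: d[i, j] is the distance from observation i to centroid j.
d <- sapply(seq_len(nrow(cl$centers)), function(j) {
  sqrt(rowSums(sweep(x, 2, cl$centers[j, ])^2))
})
colnames(d) <- paste0("centroid", seq_len(ncol(d)))

# Distance from each observation to the centroid of its assigned cluster.
own <- d[cbind(seq_len(nrow(x)), cl$cluster)]

# One point per observation: x position is the assigned cluster (jittered so
# circles do not overlap), y position is the distance to that centroid, and
# colour is the true species, so misplaced samples stand out by colour.
plot(jitter(cl$cluster, amount = 0.15), own,
     col = as.integer(iris$Species), pch = 1, xaxt = "n",
     xlab = "Assigned cluster", ylab = "Euclidean distance to own centroid")
axis(1, at = 1:3)
legend("topright", legend = levels(iris$Species), col = 1:3, pch = 1)
```

As a consistency check, `sum(own^2)` should agree with `cl$tot.withinss`, since the within-cluster sum of squares is exactly the sum of squared distances to the assigned centroids. The full matrix `d` can also be used to plot distances to the other centroids, for example `d[, 1]` against `d[, 2]`.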

Plotting clusters with k-means by distance from the centroid

0 answers: