Question

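I am currently trying to do PCA in R. This is my first data mining project. I have about 200 features and about 3000 rows of data.

The data is not in standardized form, and I need to reduce the number of dimensions, so I am using PCA. This is what I have done so far: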

x <- princomp(data,scores=TRUE,cor=TRUE)

To reduce the dimensions, I understand I should look at the score values, so I printed the first few rows:

head(x$scores)

This is the output:

       Comp.1     Comp.2     Comp.3     Comp.4    ...
[1,]  6.831452 -4.4316218 -1.9226226 -0.8344245 
[2,] -1.808007 -4.2743390  1.0173944  0.4527465
[3,] -7.750329 -4.9523056 -1.6750438  1.6247354 
.
.
.

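Now I am not sure how to interpret these matrices and pick out the best attributes (and so perform the dimensionality reduction). It would be great if someone could help me with this.

P.S. I searched a lot but did not find an answer to this.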

Answer 1

The scores are only one part of the result. The general formula is:

original_data =~ approximation = (scores * loadings) * scale + center

where:

1. `scores` are the coordinates in your new orthogonal basis
1. `loadings` are the directions of the new axes in the old basis
1. `scale` is the scaling applied to each dimension
1. `center` holds the coordinates of the new basis origin in the old basis

With the R objects, the formula above becomes:

data =~ t(t(x$scores %*% t(x$loadings)) * x$scale + x$center)
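As a quick sanity check of this formula, here is a minimal sketch that rebuilds a dataset from all of its components. It uses the built-in `USArrests` data purely as a stand-in (an assumption, since the original data is not shown), and `rebuilt` and `max_err` are illustrative names.

```r
# Stand-in data: USArrests ships with R. Substitute your own numeric data frame.
x <- princomp(USArrests, scores = TRUE, cor = TRUE)

# Undo the rotation (scores %*% t(loadings)), then undo the scaling and centering.
# The inner t() puts variables in rows so that x$scale and x$center, which have
# one entry per variable, are recycled down each column correctly.
rebuilt <- t(t(x$scores %*% t(x$loadings)) * x$scale + x$center)

# Largest absolute difference from the original data.
# With every component kept, this should be at the level of rounding error.
max_err <- max(abs(rebuilt - as.matrix(USArrests)))
print(max_err)
```

Keeping every component loses nothing, which is why the approximation only becomes a real approximation once some components are dropped.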

To reduce the dimensionality, keep only the first i loadings:

data =~ t(t(x$scores[, 1:i] %*% t(x$loadings[, 1:i ])) * x$scale + x$center)
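One common way to pick `i` is to look at how much of the total variance the leading components carry. The sketch below assumes a 90% threshold, which is only a rule of thumb and not something the formula requires. It again uses the built-in `USArrests` data as a stand-in, and `var_share`, `cum_share`, `reduced`, and `approx` are illustrative names.

```r
# Stand-in data: USArrests ships with R. Substitute your own numeric data frame.
x <- princomp(USArrests, scores = TRUE, cor = TRUE)

# Share of the total variance carried by each component, and the running total.
var_share <- x$sdev^2 / sum(x$sdev^2)
cum_share <- cumsum(var_share)

# Assumed rule of thumb: smallest i whose components cover at least 90% of the variance.
i <- which(cum_share >= 0.90)[1]

# The reduced dataset: one row per observation, i columns instead of the original count.
reduced <- x$scores[, 1:i, drop = FALSE]

# Approximation of the original data rebuilt from those i components only.
approx <- t(t(reduced %*% t(x$loadings[, 1:i, drop = FALSE])) * x$scale + x$center)
```

For downstream modelling you would normally work with `reduced` (the first `i` score columns). `approx` is mainly useful for checking how much information the dropped components carried. `drop = FALSE` keeps the results as matrices even when `i` is 1.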
