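I have a matrix X of shape (36000, 3600), and I am trying to compute Sigma = X * X_transpose as part of implementing the PCA algorithm.
X is C_CONTIGUOUS.
Both computing the product directly and looping over the second dimension end in memory problems (my computer has 8 GB of RAM and runs out of memory).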
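A rough size estimate helps show where the memory goes. The sketch below only does arithmetic on the shapes stated above and assumes float32 throughout, as in the code further down; the variable names are illustrative.

```python
import numpy as np

d, n = 36000, 3600                         # shape of X, as stated above
itemsize = np.dtype('float32').itemsize    # bytes per float32 element

sigma_bytes = d * d * itemsize   # one (d, d) array such as Sigma
x_bytes = d * n * itemsize       # X itself
gram_bytes = n * n * itemsize    # for comparison: an (n, n) array

for name, size in [("Sigma (d x d)", sigma_bytes),
                   ("X (d x n)", x_bytes),
                   ("n x n matrix", gram_bytes)]:
    print(f"{name}: {size / 1e9:.2f} GB")
```

A single (d, d) float32 array already takes most of 8 GB. In the loop version, `self.Sigma + i_result` allocates a third (d, d) array before the old `self.Sigma` is released, so up to three of them are alive at once.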
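What is the best way to do this? My code is attached.
I am new to Python (but not to programming), and this is the first thing I am writing in it, so any tips are welcome!
Thanks a lot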
import numpy as np


class PCAProjector:
    def __init__(self, X):
        # X has shape (d, n): one sample per column
        d, n = X.shape
        self.X = X
        self.Sigma = np.zeros((d, d), dtype='float32', order='C')
        # Accumulate the outer product of every sample with itself
        for i in range(n):
            print(i)
            x_i = np.zeros((d, 1), dtype='float32', order='C')
            x_i = x_i + X[:, [i]]
            x_i_transpose = np.zeros((1, d), dtype='float32', order='C')
            x_i_transpose = x_i_transpose + np.transpose(x_i)
            i_result = np.dot(x_i, x_i_transpose)   # a full (d, d) temporary
            self.Sigma = self.Sigma + i_result       # allocates another (d, d) array
        self.Sigma = self.Sigma / n
        # EigenVecValculator is my own helper class (not shown here)
        self.EigenVectorsSorted = EigenVecValculator().calEigenVectors(self.Sigma)

    def projectAllSamples(self, numOfDimensions):
        """
        Projects samples to numOfDimensions dimensions
        Projects samples that were passed in Ctor (self.X)
        :param numOfDimensions: number of dimensions to project on
        :return: matrix of (numOfDimensions)X(numOfSamples) that contains the projected samples
        """
        transformationMatrix = self.EigenVectorsSorted[:, 0:numOfDimensions]
        # np.dot gives the matrix product; `*` is elementwise on plain ndarrays
        return np.dot(np.transpose(transformationMatrix), self.X)
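
For comparison, here is a minimal sketch of one commonly used workaround when d is much larger than n: work with the small (n, n) matrix X_transpose * X instead of the (d, d) matrix. If v is an eigenvector of X_transpose * X / n with eigenvalue lam, then X * v is an eigenvector of X * X_transpose / n with the same eigenvalue. The function name `top_components` and the parameter `k` are made up for illustration and are not part of the code above. Like the code above, it does not center the data, and it assumes the k leading eigenvalues are nonzero.

```python
import numpy as np


def top_components(X, k):
    """Return the k leading eigenvectors and eigenvalues of (X @ X.T) / n
    without ever forming the (d, d) matrix. X has shape (d, n)."""
    d, n = X.shape
    G = (X.T @ X) / n                          # small (n, n) matrix
    eigvals, V = np.linalg.eigh(G)             # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]      # indices of the k largest
    eigvals, V = eigvals[order], V[:, order]
    U = X @ V                                  # (d, k): eigenvectors of X X^T / n, unnormalized
    U = U / np.linalg.norm(U, axis=0)          # scale each column to unit length
    return U, eigvals


# Usage: project all samples onto the first k directions
# U, eigvals = top_components(X, k)
# projected = U.T @ X                          # shape (k, n)
```

With the shapes in this question, the largest arrays involved are X itself and the (n, n) matrix, so nothing of size (d, d) is ever allocated.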