Python dot product runs out of memory on large matrices

Asked: 2017-04-29 10:12:41

Tags: python pca dot-product

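I have a matrix X of shape (36000, 3600), and I am trying to compute Sigma = X * X_transpose as part of implementing the PCA algorithm.

X is C_CONTIGUOUS.

Both computing the product directly and looping over the second dimension end in memory problems (my machine, which has 8 GB of RAM, runs out of memory).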
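For a sense of scale, here is a rough back-of-the-envelope calculation of the array sizes involved. It assumes the `float32` dtype that the attached code uses for `Sigma`. The variable names are only illustrative.

```python
import numpy as np

d, n = 36000, 3600                         # shape of X as stated above
itemsize = np.dtype("float32").itemsize    # 4 bytes per element (assumed dtype)

x_bytes = d * n * itemsize                 # X itself
sigma_bytes = d * d * itemsize             # one dense (d, d) matrix such as Sigma
gram_bytes = n * n * itemsize              # an (n, n) matrix, for comparison

print("X:      %.2f GB" % (x_bytes / 1e9))
print("(d, d): %.2f GB" % (sigma_bytes / 1e9))
print("(n, n): %.2f GB" % (gram_bytes / 1e9))
```

A single (36000, 36000) `float32` array comes to about 5.18 GB. An expression like `A = A + B` on arrays of that size can keep three of them alive at once, which is well beyond 8 GB.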

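What is the best way to achieve this? My code is attached.

I am new to Python (but not to programming), and this is the first thing I am doing with it, so any tips are welcome!

Thanks a lot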

import numpy as np


class PCAProjector:

    def __init__(self, X):
        # X holds one sample per column: d features by n samples
        d, n = X.shape
        self.X = X
        # Sigma is a dense (d, d) float32 matrix
        self.Sigma = np.zeros((d, d), dtype='float32', order='C')
        for i in range(n):
            print(i)
            x_i = np.zeros((d, 1), dtype='float32', order='C')
            x_i = x_i + X[:, [i]]
            x_i_transpose = np.zeros((1, d), dtype='float32', order='C')
            x_i_transpose = x_i_transpose + np.transpose(x_i)
            # Outer product of sample i with itself: another full (d, d) array
            i_result = np.dot(x_i, x_i_transpose)
            self.Sigma = self.Sigma + i_result

        self.Sigma = self.Sigma / n
        # EigenVecValculator is the asker's own helper and is not shown in the post
        self.EigenVectorsSorted = EigenVecValculator().calEigenVectors(self.Sigma)

    def projectAllSamples(self, numOfDimensions):
        """
        Projects samples to numOfDimensions dimensions

        Projects samples that were passed in Ctor (self.X)
        :param numOfDimensions: number of dimensions to project on
        :return: matrix of (numOfDimensions)X(numOfSamples) that contains the projected samples
        """
        transformationMatrix = self.EigenVectorsSorted[:, 0:numOfDimensions]
        # Matrix product rather than element-wise *, so the shape matches the docstring
        return np.dot(np.transpose(transformationMatrix), self.X)
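One possible way around the (d, d) matrix, sketched here under some assumptions: since n is much smaller than d, the eigenvectors of X * X_transpose with non-zero eigenvalues can be recovered from the much smaller (n, n) matrix X_transpose * X. If `G v = lambda v` for `G = X_transpose * X / n`, then `X v` is an eigenvector of `X * X_transpose / n` with the same eigenvalue. The function name `top_components_via_gram` is hypothetical and not part of the code above. As in the question, X is not mean-centered here, and the sketch assumes the top `k` eigenvalues are non-zero.

```python
import numpy as np


def top_components_via_gram(X, k):
    """Return the top-k unit eigenvectors and eigenvalues of X X^T / n.

    X has shape (d, n) with one sample per column. Only an (n, n)
    matrix is ever formed, never a (d, d) one.
    """
    d, n = X.shape
    gram = np.dot(X.T, X) / n                  # shape (n, n)
    eigvals, eigvecs = np.linalg.eigh(gram)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]      # indices of the k largest
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    # Map each small eigenvector v back to the d-dimensional space as X v
    components = np.dot(X, eigvecs)            # shape (d, k)
    # Rescale columns to unit length (assumes non-zero eigenvalues)
    components = components / np.linalg.norm(components, axis=0)
    return components, eigvals


# Small illustrative check on random data
rng = np.random.RandomState(0)
X_small = rng.randn(50, 8)
components, eigvals = top_components_via_gram(X_small, 3)
projected = np.dot(components.T, X_small)      # shape (3, 8)
```

With the shapes from the question, the (n, n) matrix is 3600 x 3600, roughly 52 MB in `float32`. `np.linalg.svd(X, full_matrices=False)` is another route that may be worth trying, since its left singular vectors span the same subspace.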

0 answers:

No answers