How is covariance implemented internally in numpy?

Date: 2013-10-31 08:01:03

Tags: python numpy matrix covariance

This is the definition of the covariance matrix: http://en.wikipedia.org/wiki/Covariance_matrix#Definition

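Each element of the matrix (except those on the main diagonal), if I am not mistaken, simplifies to E(x_{i} * x_{j}) - mean(i) * mean(j), where i and j are the row and column indices of the covariance matrix.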

From the numpy documentation:

>>> x = np.array([[0, 2], [1, 1], [2, 0]]).T
>>> x
array([[0, 1, 2],
       [2, 1, 0]])
>>> np.cov(x)
array([[ 1., -1.],
       [-1.,  1.]])

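The first row, [0, 1, 2], corresponds to X_{0}, and the second row, [2, 1, 0], corresponds to X_{1}. How is the expected value of X_{0} * X_{1} computed, given that the distribution of the random variables is unknown?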

Thanks.

1 answer:

Answer 0 (score: 4)

Just check the code. cov is defined in \site-packages\numpy\lib\function_base.py:

def cov(m, y=None, rowvar=1, bias=0, ddof=None):
    """
    Estimate a covariance matrix, given data.

    Covariance indicates the level to which two variables vary together.
    If we examine N-dimensional samples, :math:`X = [x_1, x_2, ... x_N]^T`,
    then the covariance matrix element :math:`C_{ij}` is the covariance of
    :math:`x_i` and :math:`x_j`. The element :math:`C_{ii}` is the variance
    of :math:`x_i`.

    Parameters
    ----------
    m : array_like
        A 1-D or 2-D array containing multiple variables and observations.
        Each row of `m` represents a variable, and each column a single
        observation of all those variables. Also see `rowvar` below.

...

    if y is not None:
        y = array(y, copy=False, ndmin=2, dtype=float)
        X = concatenate((X,y), axis)

    X -= X.mean(axis=1-axis)[tup]
    if rowvar:
        N = X.shape[1]
    else:
        N = X.shape[0]

    if ddof is None:
        if bias == 0:
            ddof = 1
        else:
            ddof = 0
    fact = float(N - ddof)

    if not rowvar:
        return (dot(X.T, X.conj()) / fact).squeeze()
    else:
        return (dot(X, X.T.conj()) / fact).squeeze()
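
To connect the excerpt back to the question: no distribution is needed, because the expectation is replaced by an average over the observed columns. The sketch below repeats the three visible steps (subtract the row means, take the dot product, divide by N - ddof) on the data from the question. It is an illustration under the default settings (rowvar true, ddof = 1), not a copy of numpy's code; the names centered, manual_cov and plug_in are made up for this example.

```python
import numpy as np

# Same data as in the question: each row is a variable, each column an observation.
x = np.array([[0, 2], [1, 1], [2, 0]], dtype=float).T

# Step 1: subtract each variable's sample mean (row-wise, as with rowvar true).
centered = x - x.mean(axis=1, keepdims=True)

# Step 2: number of observations and the default normalisation (ddof = 1).
n_obs = x.shape[1]
ddof = 1

# Step 3: sum of products of deviations, divided by N - ddof.
manual_cov = centered @ centered.T.conj() / (n_obs - ddof)

# The formula from the question, E(x_i * x_j) - mean(i) * mean(j), with E taken
# as a plain average over observations. This divides by N, so it corresponds to
# ddof = 0 rather than the default ddof = 1.
plug_in = (x[0] * x[1]).mean() - x[0].mean() * x[1].mean()

print(manual_cov)
print(np.allclose(manual_cov, np.cov(x)))
print(np.isclose(plug_in, np.cov(x, ddof=0)[0, 1]))
```

With the default ddof = 1 the divisor is N - 1 = 2, which gives the unbiased sample estimate shown in the documentation example. Passing ddof=0 (or bias=1) divides by N instead and matches the formula from the question.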