Reducing an n-dimensional array to a 2-D array with additional columns

Time: 2018-07-18 17:39:43

Tags: python numpy

I have an n-dimensional array in numpy, along with n column vectors (one per dimension). I need to convert the n-dimensional array into a 2-D numpy array with

rows = size of n-dim array

cols = n + 1

For simplicity, take a 2-D case:

a = np.random.randint(50, size=(2,2))
r = np.array([0.2,1.9])
c = np.array([4,5])
a =>
array([[45, 18], [ 4, 24]])
c => array([4, 5])
r => array([ 0.2,  1.9])

I need to convert it into the following:

array([[ 45. ,   4. ,   0.2],
   [ 18. ,   5. ,   0.2],
   [  4. ,   4. ,   1.9],
   [ 24. ,   5. ,   1.9]])

I wrote it as follows. I feel this is not the best solution, but it works and seems fast enough even for relatively large inputs:

def get_2d_array(arr, r, c):
    w = None
    for i in range(arr.shape[0]):
        rv = np.full((arr[i].shape[0], 1), r[i])
        z = np.concatenate((arr[i].reshape(-1, 1), c.reshape(-1, 1), rv), axis=1)
        if w is None:
            w = z
        else:
            w = np.concatenate((w, z))
    return w

Is there another way to do this in numpy without a loop?

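Also, to generalize: I actually have a 4-D array that I need to reduce to a 2-D array with the same structure as above. I could not get a recursive function to work, and ended up reducing the 4th and 3rd dimensions explicitly, as shown below: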

def reduce_3d(arr3, row, col, third_dim_array):
    # Flatten each 2-D slice, then append that slice's third-dimension value.
    x = None
    for i in range(arr3.shape[0]):
        # get_2d_array is the 2-D helper defined above
        x1 = get_2d_array(arr3[i], row, col)
        third_array = np.full((x1.shape[0], 1), third_dim_array[i])
        x1 = np.concatenate((x1, third_array), axis=1)
        if x is None:
            x = x1
        else:
            x = np.concatenate((x, x1))
    return x

def reduce_4d(air_temp, row, col, third, second):
    # Flatten each 3-D block, then append that block's outermost value.
    w = None
    for j in range(air_temp.shape[0]):
        w1 = reduce_3d(air_temp[j], row, col, third)
        second_arr = np.full((w1.shape[0], 1), second[j])
        w1 = np.concatenate((w1, second_arr), axis=1)
        if w is None:
            w = w1
        else:
            w = np.concatenate((w, w1))
    return w

The output for the 4-D example is as follows:

a = np.random.randint(100, size=(2,3,2,2))
array([[[[ 8, 38],
     [89, 95]],
    [[63, 82],
     [24, 27]],
    [[22, 18],
     [25, 30]]],
   [[[94, 21],
     [83,  9]],
    [[25, 98],
     [84, 57]],
    [[89, 20],
     [40, 60]]]])

r => array([ 0.2,  1.9])
c => array([4, 5])
third => array([ 50, 100, 150])
second => array([[datetime.date(2009, 1, 1)],
   [datetime.date(2010, 5, 4)]], dtype=object)

z = reduce_4d(a,r,c,third,second)
z

array([[8.0, 4.0, 0.2, 50.0, datetime.date(2009, 1, 1)],
   [38.0, 5.0, 0.2, 50.0, datetime.date(2009, 1, 1)],
   [89.0, 4.0, 1.9, 50.0, datetime.date(2009, 1, 1)],
   [95.0, 5.0, 1.9, 50.0, datetime.date(2009, 1, 1)],
   [63.0, 4.0, 0.2, 100.0, datetime.date(2009, 1, 1)],
   [82.0, 5.0, 0.2, 100.0, datetime.date(2009, 1, 1)],
   [24.0, 4.0, 1.9, 100.0, datetime.date(2009, 1, 1)],
   [27.0, 5.0, 1.9, 100.0, datetime.date(2009, 1, 1)],
   [22.0, 4.0, 0.2, 150.0, datetime.date(2009, 1, 1)],
   [18.0, 5.0, 0.2, 150.0, datetime.date(2009, 1, 1)],
   [25.0, 4.0, 1.9, 150.0, datetime.date(2009, 1, 1)],
   [30.0, 5.0, 1.9, 150.0, datetime.date(2009, 1, 1)],
   [94.0, 4.0, 0.2, 50.0, datetime.date(2010, 5, 4)],
   [21.0, 5.0, 0.2, 50.0, datetime.date(2010, 5, 4)],
   [83.0, 4.0, 1.9, 50.0, datetime.date(2010, 5, 4)],
   [9.0, 5.0, 1.9, 50.0, datetime.date(2010, 5, 4)],
   [25.0, 4.0, 0.2, 100.0, datetime.date(2010, 5, 4)],
   [98.0, 5.0, 0.2, 100.0, datetime.date(2010, 5, 4)],
   [84.0, 4.0, 1.9, 100.0, datetime.date(2010, 5, 4)],
   [57.0, 5.0, 1.9, 100.0, datetime.date(2010, 5, 4)],
   [89.0, 4.0, 0.2, 150.0, datetime.date(2010, 5, 4)],
   [20.0, 5.0, 0.2, 150.0, datetime.date(2010, 5, 4)],
   [40.0, 4.0, 1.9, 150.0, datetime.date(2010, 5, 4)],
   [60.0, 5.0, 1.9, 150.0, datetime.date(2010, 5, 4)]], dtype=object)

z.shape ==> (24L, 5L)
z.size => 120
a.size ==> 24

z.shape[0] == a.size
z.shape[1] == a.ndim + 1

Is there a better, more efficient way to do this?

Thanks a lot.

2 Answers:

Answer 0: (score: 1)

Here is a solution that uses np.meshgrid to create the column combinations and np.vstack to stack them together:

In [101]: a = np.array([[45, 18], [ 4, 24]])

In [102]: col_vecs = [np.array([4, 5]), np.array([0.2, 1.9])]

In [103]: np.vstack([np.ravel(a)] + [c.ravel() for c in np.meshgrid(*col_vecs)]).T
Out[103]: 
array([[45. ,  4. ,  0.2],
       [18. ,  5. ,  0.2],
       [ 4. ,  4. ,  1.9],
       [24. ,  5. ,  1.9]])

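The same approach extends to higher dimensions, although with more than two column vectors the default 'xy' indexing of np.meshgrid swaps the first two axes, so the vectors need to be passed in axis order with indexing='ij' for the rows to line up with np.ravel(a).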
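For the n-dimensional case, here is a minimal sketch of how the meshgrid idea might be generalized. The helper name `flatten_with_coords` and its argument layout (one label vector per axis, in axis order) are my own assumptions, not part of the original answer. It relies on `indexing='ij'` so that each grid has the same shape as `a` and ravels in the same order:

```python
import numpy as np

def flatten_with_coords(a, axis_vecs):
    """Hypothetical helper: axis_vecs[k] holds the labels for axis k of `a`."""
    # indexing='ij' keeps the grid axes in the same order as the inputs,
    # so every grid has shape a.shape and ravels in step with a.ravel().
    grids = np.meshgrid(*axis_vecs, indexing='ij')
    # Emit the last axis first to match the column order used in the question.
    cols = [a.ravel()] + [g.ravel() for g in reversed(grids)]
    return np.column_stack(cols)

a = np.array([[45, 18], [4, 24]])
r = np.array([0.2, 1.9])   # labels for axis 0
c = np.array([4, 5])       # labels for axis 1
out = flatten_with_coords(a, [r, c])
print(out)
```

For the 4-D array in the question, the call would presumably be `flatten_with_coords(a, [second.ravel(), third, r, c])`. Mixing dates with numbers should yield an object-dtype result, as in the question's output.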

Answer 1: (score: 0)

I stumbled upon another, less complicated approach and am just mentioning it here. Please correct it or add other options. Thanks.

def reduce(a, dims):
    """
    Iterate over the dimensions of the array and progressively build
    the columns through a combination of `tile` and `repeat`.
    :param a: the input multi-dimensional array
    :param dims: a list of feature vectors of shape (n,), last dimension first,
        i.e. the first element of this list is an np array that corresponds
        to the last dimension of a
    :return: a 2-D array of shape (a.size, a.ndim + 1)
    """
    item_count = a.size
    m_all = a.reshape((-1, 1))
    repeat_cnt = 1
    level = 0
    for i in range(a.ndim):
        if level == 0:
            # last axis: its values change on every row
            repeat_cnt = 1
            level = -1
        else:
            # each value repeats once per element of the faster-varying axes
            repeat_cnt = a.shape[level] * repeat_cnt
            level = level - 1
        cur_array = dims[i]
        # integer division: np.tile needs an integer repetition count
        tile_cnt = item_count // (cur_array.size * repeat_cnt)
        cur_col = np.tile(np.repeat(cur_array, repeat_cnt), tile_cnt).reshape((-1, 1))
        m_all = np.concatenate((m_all, cur_col), axis=1)
    return m_all
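As a sanity check on the tile/repeat pattern, here is a compact, self-contained sketch that compares it with a plain index lookup through `np.unravel_index`. Both function names are hypothetical, and the integer years are only a numeric stand-in for the dates in the question:

```python
import numpy as np

def reduce_tile_repeat(a, dims):
    """dims[0] labels the last axis of `a`, dims[1] the one before it, etc."""
    cols = [a.reshape(-1, 1)]
    repeat_cnt = 1  # number of consecutive rows sharing one label value
    for labels in dims:
        labels = np.asarray(labels)
        tile_cnt = a.size // (labels.size * repeat_cnt)  # must be an integer
        col = np.tile(np.repeat(labels, repeat_cnt), tile_cnt)
        cols.append(col.reshape(-1, 1))
        repeat_cnt *= labels.size
    return np.concatenate(cols, axis=1)

def reduce_by_index(a, dims):
    """Reference version: look up each label from the element's N-d index."""
    idx = np.unravel_index(np.arange(a.size), a.shape)
    cols = [a.ravel()]
    cols += [np.asarray(d)[i] for d, i in zip(dims, reversed(idx))]
    return np.column_stack(cols)

a = np.arange(24).reshape(2, 3, 2, 2)
dims = [np.array([4, 5]), np.array([0.2, 1.9]),
        np.array([50, 100, 150]), np.array([2009, 2010])]  # years: stand-in
out = reduce_tile_repeat(a, dims)
same = np.array_equal(out, reduce_by_index(a, dims))
print(out.shape, same)
```

The index-based version is easier to reason about, while the tile/repeat version avoids building an index array for every axis. Which one is faster in practice would need to be measured.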