Summing a 2D NumPy array by multiple labels

Time: 2013-06-13 17:41:51

Tags: python arrays numpy sum 2d

Thank you for answering my earlier questions. This is my third one.

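  1. Each element of the data array is a coordinate (x, y).
  2. Each coordinate has two labels.
  3. Goal: sum the elements that share the same pair of labels.
  4. For example, if the input is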

    data = numpy.array( [ [1, 2], [3,8], [4,5], [2,9], [1, 3], [7, 2] ] )
    label1 = numpy.array([0,0,1,1,2,2])
    label2 = numpy.array([0,1,0,0,1,1])
    

    the result should be:

    array([[[ 1 ,  2 ],
            [ 3 ,  8 ]],
    
           [[ 6 , 14 ],
            [ 0 ,  0 ]],
    
           [[ 0 ,  0 ],
            [ 8 ,  5 ]]])
    

    Here is my current code:

    import numpy
    from scipy import ndimage
    
    data = numpy.array( [ [1, 2], [3,8], [4,5], [2,9], [1, 3], [7, 2] ] )
    label1 = numpy.array([0,0,1,1,2,2])
    label2 = numpy.array([0,1,0,0,1,1])
    
    kinds_of_label1 = 3
    kinds_of_label2 = 2
    
    label1_l = label1.size
    label2_l = label2.size
    
    label12 = label1 * 2 + label2
    kinds12_range = range(kinds_of_label1 * kinds_of_label2 )
    
    result = numpy.zeros( (kinds_of_label1, kinds_of_label2, 2) )
    result_T = result.view().reshape( (kinds_of_label1 * kinds_of_label2, 2) ).T
    result_T[0] = ndimage.measurements.sum( data.T[0], label12, index = kinds12_range )
    result_T[1] = ndimage.measurements.sum( data.T[1], label12, index = kinds12_range )
    counting = numpy.bincount(label12)
    
    print(result)
    print(counting)
    

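    This works, but summing the x and y coordinates separately (as in result_T[0] and result_T[1]) looks ugly. Also, ndimage.measurements.sum returns floating-point results, and integer arithmetic would be faster.

    Can we make this faster and cleaner?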
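For reference, the goal stated in the list above can be written as a plain Python loop. This is only a minimal sketch to check faster versions against, not a recommended implementation. The helper name `sum_by_two_labels_loop` is made up for illustration, and it assumes both label arrays hold integers starting at 0.

```python
import numpy

def sum_by_two_labels_loop(data, label1, label2, kinds_of_label1, kinds_of_label2):
    # Hypothetical reference helper: result[i, j] is the sum of all rows of
    # `data` whose labels are (i, j). Assumes labels are integers 0, 1, 2, ...
    result = numpy.zeros((kinds_of_label1, kinds_of_label2, data.shape[1]),
                         dtype=data.dtype)
    for row, i, j in zip(data, label1, label2):
        result[i, j] += row
    return result

data = numpy.array([[1, 2], [3, 8], [4, 5], [2, 9], [1, 3], [7, 2]])
label1 = numpy.array([0, 0, 1, 1, 2, 2])
label2 = numpy.array([0, 1, 0, 0, 1, 1])

expected = sum_by_two_labels_loop(data, label1, label2, 3, 2)
print(expected)
```

Because the result array takes its dtype from `data`, the sums stay integers for integer input.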

1 answer:

Answer 0 (score: 0)

#### Wrong Answer.  Do Not Use. ####
import numpy
### Input ###
label1 = numpy.array([0,0,1,1,2,2])
kinds_of_label1 = 3

label2 = numpy.array([0,1,0,0,1,1])
kinds_of_label2 = 2

data = numpy.array( [ [1, 2], [3,8], [4,5], [2,9], [1, 3], [7, 2] ] )

### Processing ####
# this assumes the values of label1 and label2 are integers 0, 1, 2, 3, ...
label1_and_2 = label1*kinds_of_label2 + label2

result = numpy.zeros( (kinds_of_label1 * kinds_of_label2, 2) )
# NOTE: this is the bug. With fancy indexing, += does not accumulate rows that share an index.
result[ label1_and_2 ] += data

counting = numpy.bincount( label1_and_2 )

### output ###
print( result.view().reshape(kinds_of_label1, kinds_of_label2, 2) )


>>> array([[[ 1.,  2.],
            [ 3.,  8.]],

           [[ 2.,  9.],
            [ 0.,  0.]],

           [[ 0.,  0.],
            [ 7.,  2.]]])
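
The output above shows [2, 9] and [7, 2] where [6, 14] and [8, 5] were expected: with fancy indexing, `result[idx] += data` does not accumulate rows that share an index. One way to get the intended sums is `numpy.add.at`, which performs unbuffered in-place addition. The following is a sketch under the same assumption as the answer (labels are integers starting at 0). It keeps the integer dtype of `data`, so the sums stay integers. Whether it is faster than the `ndimage` version was not measured here.

```python
import numpy

label1 = numpy.array([0, 0, 1, 1, 2, 2])
kinds_of_label1 = 3
label2 = numpy.array([0, 1, 0, 0, 1, 1])
kinds_of_label2 = 2
data = numpy.array([[1, 2], [3, 8], [4, 5], [2, 9], [1, 3], [7, 2]])

# Integer result indexed as result[label1, label2, coordinate].
result = numpy.zeros((kinds_of_label1, kinds_of_label2, data.shape[1]),
                     dtype=data.dtype)

# Unbuffered addition: every row with the same (label1, label2) pair is added.
numpy.add.at(result, (label1, label2), data)

# Number of elements per (label1, label2) pair, including empty pairs.
counting = numpy.bincount(label1 * kinds_of_label2 + label2,
                          minlength=kinds_of_label1 * kinds_of_label2)

print(result)
print(counting)
```

Indexing `result` directly with the pair `(label1, label2)` avoids building the combined label and reshaping afterwards, and both coordinates are summed in one call.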