Boolean slicing with multiple (chained) references in numpy

Date: 2017-07-17 13:49:46

Tags: python numpy

I want to modify a multidimensional numpy array (say, mydata) based on several boolean conditions applied in a cascade, one after another.

This works:

mydata[condition] = something

This does not:

mydata[condition1][condition2] = something

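where all conditions are boolean arrays of compatible (broadcastable) shapes. Is there a reason why the second form does not work, and what would be a good solution? For now, I work around it by assigning the result back into the original array: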

tempdata = mydata[condition1]
tempdata[condition2] = something
mydata[condition1] = tempdata

1 answer:

Answer 0 (score: 2)

To solve such cases, use chained / cascaded integer-indexing -

idx1 = np.flatnonzero(condition1)
idx2 = np.flatnonzero(condition2)
mydata[idx1[idx2]] =  something
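
As for why the chained form from the question has no effect: boolean (advanced) indexing returns a copy rather than a view, so `mydata[condition1][condition2] = something` writes into a temporary array that is discarded right away. A minimal sketch of that behaviour, reusing the small 1-D sample arrays from the run below (the names `selected`, `is_view`, `before` and `unchanged` are only illustrative):

```python
import numpy as np

mydata = np.array([2, 6, 8, 0, 9, 3, 1, 4])
condition1 = np.array([True, False, True, True, True, False, False, True])
condition2 = np.array([False, True, False, True, True])

# Boolean (advanced) indexing builds a new array; it does not share memory
# with the array it was taken from.
selected = mydata[condition1]
is_view = np.shares_memory(selected, mydata)

# The chained assignment therefore modifies only a temporary copy.
before = mydata.copy()
mydata[condition1][condition2] = -1
unchanged = np.array_equal(mydata, before)

print(is_view, unchanged)
```

A single index expression such as `mydata[idx1[idx2]] = something` goes straight through `__setitem__` on `mydata`, which is why it does modify the original.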

Sample run -

In [42]: mydata = np.array([2,6,8,0,9,3,1,4])
    ...: mydata_copy = mydata.copy() # make copy for verification
    ...: condition1 = np.array([True,False,True,True,True,False,False,True])
    ...: condition2 = np.array([False,True,False,True,True])
    ...: something = -1
    ...: 

# Working solution from question    
In [43]: tempdata = mydata[condition1]
    ...: tempdata[condition2] = something
    ...: mydata[condition1] = tempdata
    ...: 

In [44]: mydata  # Check changed values
Out[44]: array([ 2,  6, -1,  0, -1,  3,  1, -1])

# Proposed solution
In [45]: idx1 = np.flatnonzero(condition1)
    ...: idx2 = np.flatnonzero(condition2)
    ...: mydata_copy[idx1[idx2]] =  something
    ...: 

In [46]: mydata_copy  # Verify changed values in copy
Out[46]: array([ 2,  6, -1,  0, -1,  3,  1, -1])
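
The sample run uses a 1-D array, while the question mentions multidimensional data. One possible adaptation, offered as a sketch and not as part of the original answer, is to assign through `mydata.flat`, since `np.flatnonzero` returns indices into the flattened array. This assumes `condition1` has exactly the same shape as `mydata` and `condition2` has one entry per element selected by `condition1`; the 3x4 array and the `expected` check are made up for illustration.

```python
import numpy as np

mydata = np.arange(12).reshape(3, 4)
condition1 = mydata % 2 == 0  # same shape as mydata, selects 6 elements
condition2 = np.array([True, False, False, True, False, True])  # one flag per selected element
something = -1

# Reference result using the workaround from the question.
expected = mydata.copy()
tempdata = expected[condition1]
tempdata[condition2] = something
expected[condition1] = tempdata

# Cascaded integer indexing on the flattened view of the array.
idx1 = np.flatnonzero(condition1)  # flat positions where condition1 holds
idx2 = np.flatnonzero(condition2)  # positions within that selection
mydata.flat[idx1[idx2]] = something

print(np.array_equal(mydata, expected))
```

Both `mydata[condition1]` and `np.flatnonzero(condition1)` enumerate the selected elements in row-major order, which is what lets `condition2` line up with `idx1`.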

Alternative method: if you don't mind modifying condition1, you can do this instead -

condition1[idx1] = condition2

and then use mydata[condition1] = something as the final step.
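
For completeness, here is the alternative written out end to end on the same sample data. This sketch works on a copy named `combined` (an illustrative name, not from the answer) so that `condition1` itself stays intact; drop the copy if overwriting it is acceptable.

```python
import numpy as np

mydata = np.array([2, 6, 8, 0, 9, 3, 1, 4])
condition1 = np.array([True, False, True, True, True, False, False, True])
condition2 = np.array([False, True, False, True, True])
something = -1

# Fold condition2 into the positions where condition1 is True.
idx1 = np.flatnonzero(condition1)
combined = condition1.copy()  # assumption: keep the caller's condition1 unchanged
combined[idx1] = condition2

# A single boolean-mask assignment now does the whole job.
mydata[combined] = something

print(mydata)  # [ 2  6 -1  0 -1  3  1 -1]
```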

Benchmarking

Let's time the proposed approach and see whether it offers any benefit over the one in the question.

Approaches -

# Original approach
def org_app(mydata,condition1,condition2):
    tempdata = mydata[condition1]
    tempdata[condition2] = something
    mydata[condition1] = tempdata
    return mydata

# Proposed one
def proposed_app(mydata,condition1,condition2):
    idx1 = np.flatnonzero(condition1)
    idx2 = np.flatnonzero(condition2)
    mydata[idx1[idx2]] =  something
    return mydata

Timings -

In [58]: mydata = np.random.rand(1000000)
    ...: mydata_copy = mydata.copy()
    ...: condition1 = np.random.rand(mydata.size)>0.5
    ...: condition2 = np.random.rand(condition1.sum())>0.5
    ...: something = -1
    ...: 

In [59]: %timeit org_app(mydata,condition1,condition2)
100 loops, best of 3: 14.1 ms per loop

In [61]: %timeit proposed_app(mydata_copy,condition1,condition2)
100 loops, best of 3: 7.44 ms per loop

Incorporating the alternative method should bring a further performance boost.
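
That last claim is not measured above. A rough harness for checking it on your own machine could look like the following; `alt_app` is a hypothetical name for the alternative method, `something` is passed explicitly instead of being read from a global, and the array size and repeat count are arbitrary choices. No timings are quoted here because they depend on hardware and NumPy version.

```python
import timeit
import numpy as np

def org_app(mydata, condition1, condition2, something):
    # Workaround from the question: copy out, modify, write back.
    tempdata = mydata[condition1]
    tempdata[condition2] = something
    mydata[condition1] = tempdata
    return mydata

def proposed_app(mydata, condition1, condition2, something):
    # Cascaded integer indexing.
    idx1 = np.flatnonzero(condition1)
    idx2 = np.flatnonzero(condition2)
    mydata[idx1[idx2]] = something
    return mydata

def alt_app(mydata, condition1, condition2, something):
    # Alternative method: merge both conditions into one boolean mask.
    combined = condition1.copy()
    combined[condition1] = condition2
    mydata[combined] = something
    return mydata

rng = np.random.default_rng(0)
base = rng.random(100_000)
condition1 = rng.random(base.size) > 0.5
condition2 = rng.random(condition1.sum()) > 0.5
approaches = (org_app, proposed_app, alt_app)

# Check that all three approaches produce the same array before timing them.
results = [f(base.copy(), condition1, condition2, -1) for f in approaches]
all_agree = all(np.array_equal(results[0], r) for r in results[1:])

repeats = 20
for f in approaches:
    data = base.copy()
    seconds = timeit.timeit(lambda: f(data, condition1, condition2, -1), number=repeats)
    print(f"{f.__name__}: {seconds / repeats * 1e3:.3f} ms per call")
```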
