Question

Say I have an array of this type:

y 

array([299839, 667136, 665420, 665418, 665421, 667135, 299799, 665419, 667137, 299800])

as the result of a "top 10" argpartition:

y = np.argpartition(-x, np.arange(10))[:10]
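
For context, here is a minimal sketch of how such an index array can be produced. The input `x` below is hypothetical (random scores with a fixed seed), not the data from the question; it only illustrates that passing a sequence as `kth` leaves the first 10 indices ordered by descending value.

```python
import numpy as np

# Hypothetical input: 1000 random scores (an assumption, not the question's data).
rng = np.random.default_rng(0)
x = rng.random(1000)

# A sequence for kth puts positions 0..9 of the result into their final sorted
# places, so the first 10 indices come out ordered by descending x.
y = np.argpartition(-x, np.arange(10))[:10]

# For distinct values this agrees with the first 10 entries of a full argsort.
same_as_full_sort = np.array_equal(y, np.argsort(-x)[:10])
print(y, same_as_full_sort)
```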

Now I would like to remove runs of sequential elements, keeping only the first (largest) element of each run, so that:

y_new
array([299839, 667136, 665420, 299799])

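Although it seems like this should be simple, I do not see an efficient way to do it (or even a good way to start). Assume the real-world application will compute the top 1000 and will need to do so many times.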

Answer 1

Here's one approach based on sorting -

import numpy as np

# Get the sorted indices
sidx = y.argsort()

# Get sorted array
ys = y[sidx]

# Get indices at which islands of sequential numbers start
cut_idx = np.flatnonzero(np.concatenate(([True], np.diff(ys)!=1 )))

# Finally get the minimum indices for each island (i.e. the earliest
# position in y) and then index into input for the desired output
y_new = y[np.minimum.reduceat(sidx, cut_idx)]

If you would like to keep the order of the elements as in the input, sort the indices and then index in the final step -

y[np.sort(np.minimum.reduceat(sidx, cut_idx))]

Sample input, output -

In [56]: y
Out[56]: 
array([299839, 667136, 665420, 665418, 665421, 667135, 299799, 665419,
       667137, 299800])

In [57]: y_new
Out[57]: array([299799, 299839, 665420, 667136])

In [58]: y[np.sort(np.minimum.reduceat(sidx, cut_idx))]
Out[58]: array([299839, 667136, 665420, 299799])
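
The steps above can be wrapped into a reusable function, which may be convenient when the operation has to run many times. This is only a sketch: the function name `first_of_each_run` and the `keep_order` flag are made up for illustration, and it assumes `y` holds unique integers (as indices returned by `argpartition` do).

```python
import numpy as np

def first_of_each_run(y, keep_order=True):
    """Return the earliest-listed member of each run of consecutive integers in y."""
    y = np.asarray(y)
    if y.size == 0:
        # reduceat cannot be applied to an empty array, so return early
        return y.copy()
    sidx = y.argsort()                      # positions in y, ordered by value
    ys = y[sidx]                            # sorted values
    # a new run starts wherever the gap to the previous sorted value is not 1
    cut_idx = np.flatnonzero(np.concatenate(([True], np.diff(ys) != 1)))
    # smallest original position within each run
    first_pos = np.minimum.reduceat(sidx, cut_idx)
    if keep_order:
        first_pos = np.sort(first_pos)      # restore the input ordering
    return y[first_pos]

y = np.array([299839, 667136, 665420, 665418, 665421, 667135, 299799,
              665419, 667137, 299800])
print(first_of_each_run(y))                    # [299839 667136 665420 299799]
print(first_of_each_run(y, keep_order=False))  # [299799 299839 665420 667136]
```

The cost is dominated by the sort, so for a top-1000 array each call should stay cheap.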

Answer 2

Here's my implementation for the problem:

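from itertools import groupby
from operator import itemgetter


a = [299839, 667136, 665420, 665418, 665421, 667135, 299799, 665419,
     667137, 299800]
new = a[:]

# to keep the first number
b = a[0]
new.sort()
# to store the different runs of consecutive numbers
saver = []
final_array = []

# in the sorted list, consecutive values share the same (index - value) key
for k, g in groupby(enumerate(new), lambda ix: ix[0] - ix[1]):
    # build a list so each run can be searched repeatedly (map is lazy in Python 3)
    ac = list(map(itemgetter(1), g))
    saver.append(ac)

final_array.append(b)
for run in saver:
    # the run containing the first number is already represented by b
    if b in run:
        continue
    # otherwise take the member of the run that appears first in the input
    for value in a:
        if value in run:
            final_array.append(value)
            break


print(final_array)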

Output

[299839, 299799, 665420, 667136]
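
For larger inputs, the nested scan over the input list could be replaced by a dictionary lookup of each value's position. The following is a sketch of that variation, not the code from this answer: the name `keep_run_heads` is made up for illustration, the values are assumed to be unique, and the result is returned in input order (so it matches the output requested in the question rather than the order printed above).

```python
from itertools import groupby

def keep_run_heads(a):
    """For each run of consecutive integers, keep the member that appears first in a."""
    # position of each value in the original list (values assumed unique)
    pos = {value: i for i, value in enumerate(a)}
    heads = []
    # in sorted order, members of one run share the same (index - value) key
    for _, group in groupby(enumerate(sorted(a)), key=lambda pair: pair[0] - pair[1]):
        heads.append(min((value for _, value in group), key=pos.__getitem__))
    # restore the original ordering
    return sorted(heads, key=pos.__getitem__)

a = [299839, 667136, 665420, 665418, 665421, 667135, 299799, 665419,
     667137, 299800]
print(keep_run_heads(a))  # [299839, 667136, 665420, 299799]
```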

Keep the first element of an array that is part of a sequence
