Say I have an array like this:
y
array([299839, 667136, 665420, 665418, 665421, 667135, 299799, 665419, 667137, 299800])
which is the "top 10" result of argpartition:
y = np.argpartition(-x, np.arange(10))[:10]
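For context, here is a minimal sketch of what that call returns. The data is made up (a random permutation of 0..49, so all values are distinct), and the names `rng`, `k` and `top_k` are only for illustration:

```python
import numpy as np

# Hypothetical data: the integers 0..49 in random order (all distinct)
rng = np.random.default_rng(0)
x = rng.permutation(50)

k = 10
# Passing a sequence as kth puts each of the positions 0..k-1 of -x in its
# sorted place, so the first k entries are the indices of the k largest
# values of x, ordered from largest to smallest
top_k = np.argpartition(-x, np.arange(k))[:k]

print(x[top_k])  # [49 48 47 46 45 44 43 42 41 40]
```

So `y` holds positions in `x`, ordered by decreasing value, which is why the first element of each run is also the largest one.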
Now I want to drop the sequential elements and keep only the first (largest) element of each run, so that:
y_new
array([299839, 667136, 665420, 299799])
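It looks like it should be simple, but I can't see an efficient way to do it (or even a good way to start). Assume the real-world application will take the top 1000 and will need to do this many times.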
Answer 0: (score: 2)
Here is a sorting-based approach:
# Get the sorted indices
sidx = y.argsort()
# Get sorted array
ys = y[sidx]
# Get indices at which islands of sequential numbers start/stop
cut_idx = np.flatnonzero(np.concatenate(([True], np.diff(ys) != 1)))
# Finally get the minimum indices for each island and then index into
# input for the desired output
y_new = y[np.minimum.reduceat(sidx, cut_idx)]
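To make the last step less opaque, the sketch below prints the intermediate arrays for the sample `y` from the question. It assumes, as in the question, that `y` contains distinct integers:

```python
import numpy as np

y = np.array([299839, 667136, 665420, 665418, 665421, 667135, 299799,
              665419, 667137, 299800])

sidx = y.argsort()
cut_idx = np.flatnonzero(np.concatenate(([True], np.diff(y[sidx]) != 1)))

print(sidx)     # [6 9 0 3 7 2 4 5 1 8]
print(cut_idx)  # [0 2 3 7]

# reduceat takes the minimum over sidx[0:2], sidx[2:3], sidx[3:7] and
# sidx[7:], i.e. the earliest position in y of each island's members
first_idx = np.minimum.reduceat(sidx, cut_idx)
print(first_idx)  # [6 0 2 1]
```

Each entry of `first_idx` is the position in `y` where an island first appears, so indexing `y` with it picks one representative per island.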
If you want to keep the elements in their original order in the output, sort the indices before indexing in the last step:
y[np.sort(np.minimum.reduceat(sidx, cut_idx))]
Sample input and output:
In [56]: y
Out[56]:
array([299839, 667136, 665420, 665418, 665421, 667135, 299799, 665419,
       667137, 299800])
In [57]: y_new
Out[57]: array([299799, 299839, 665420, 667136])
In [58]: y[np.sort(np.minimum.reduceat(sidx, cut_idx))]
Out[58]: array([299839, 667136, 665420, 299799])
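Since the question mentions running this many times, the steps above could be wrapped in a small helper. This is only a sketch: the function name `first_of_each_run` and the `keep_order` flag are made up for illustration, and it assumes a non-empty 1-D input with distinct integers.

```python
import numpy as np

def first_of_each_run(y, keep_order=True):
    """From each run of consecutive integers, keep the element that
    appears first in y (assumes y is non-empty with distinct values)."""
    y = np.asarray(y)
    sidx = y.argsort()
    ys = y[sidx]
    # Positions in the sorted array where a new island starts
    cut_idx = np.flatnonzero(np.concatenate(([True], np.diff(ys) != 1)))
    # Earliest position in y among the members of each island
    first_idx = np.minimum.reduceat(sidx, cut_idx)
    if keep_order:
        # Restore the order in which the islands first appear in y
        first_idx = np.sort(first_idx)
    return y[first_idx]

y = np.array([299839, 667136, 665420, 665418, 665421, 667135, 299799,
              665419, 667137, 299800])
print(first_of_each_run(y))                    # [299839 667136 665420 299799]
print(first_of_each_run(y, keep_order=False))  # [299799 299839 665420 667136]
```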
Answer 1: (score: 0)
Here is my implementation for this problem:
from itertools import groupby
from operator import itemgetter

a = [299839, 667136, 665420, 665418, 665421, 667135, 299799, 665419,
     667137, 299800]
# work on a sorted copy so consecutive numbers end up next to each other
new = sorted(a)
# to keep the first number
b = a[0]
# to store the different runs of consecutive numbers
saver = []
final_array = []
# index - value is constant within a run of consecutive numbers
for k, g in groupby(enumerate(new), lambda pair: pair[0] - pair[1]):
    ac = list(map(itemgetter(1), g))
    saver.append(ac)
final_array.append(b)
for i in range(len(saver)):
    for j in range(len(a)):
        if a[j] in saver[i]:
            # b is already in the result, so skip the rest of its run
            if b != a[j]:
                final_array.append(a[j])
            break
print(final_array)
Output:
[299839, 299799, 665420, 667136]