Splitting a list into groups of close numbers

Asked: 2015-05-14 15:22:01

Tags: python python-3.x numpy split

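I am using NumPy to find intersection points on a graph, but isclose returns multiple values for each intersection.

So I am going to try to find their average. But first, I want to isolate the similar values. It is also a technique I think would be useful to know.

I have a list of the x values of the intersections, called idx, which looks like this: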

[-8.67735471 -8.63727455 -8.59719439 -5.5511022  -5.51102204 -5.47094188
 -5.43086172 -2.4248497  -2.38476954 -2.34468938 -2.30460922  0.74148297
  0.78156313  0.82164329  3.86773547  3.90781563  3.94789579  3.98797595
  7.03406814  7.0741483   7.11422846]

I would like to split it into separate lists, each made up of similar numbers.

This is what I have done so far:

n = 0
for i in range(len(idx)):
    try:
        if (idx[n]-idx[n-1])<0.5:
            sdx.append(idx[n-1])
        else:
            print(sdx)
            sdx = []
    except:
        sdx.append(idx[n-1])
    n = n+1

It works for the most part, but it drops some numbers:

[-8.6773547094188377, -8.6372745490981959]
[-5.5511022044088181, -5.5110220440881763, -5.4709418837675354]
[-2.4248496993987976, -2.3847695390781567, -2.3446893787575149]
[0.7414829659318638, 0.78156312625250379]
[3.8677354709418825, 3.9078156312625243, 3.9478957915831661]

There's probably a more efficient way to do this. Does anyone know of one?

2 Answers:

Answer 0: (score: 7)

Given that you have a NumPy array, you can use np.split, splitting wherever the difference between consecutive elements is > .5:

import numpy as np

x = np.array([-8.67735471, -8.63727455, -8.59719439, -5.5511022, -5.51102204,
              -5.47094188, -5.43086172, -2.4248497, -2.38476954, -2.34468938,
              -2.30460922, 0.74148297, 0.78156313, 0.82164329, 3.86773547,
              3.90781563, 3.94789579, 3.98797595, 7.03406814, 7.0741483])

print(np.split(x, np.where(np.diff(x) > .5)[0] + 1))

[array([-8.67735471, -8.63727455, -8.59719439]),
 array([-5.5511022 , -5.51102204, -5.47094188, -5.43086172]),
 array([-2.4248497 , -2.38476954, -2.34468938, -2.30460922]),
 array([ 0.74148297,  0.78156313,  0.82164329]),
 array([ 3.86773547,  3.90781563,  3.94789579,  3.98797595]),
 array([ 7.03406814,  7.0741483 ])]

np.where(np.diff(x) > .5)[0] returns the indices at which the next element is more than .5 larger than the current one, that is, the last index of each group:

In [6]: np.where(np.diff(x) > .5)[0]
Out[6]: array([ 2,  6, 10, 13, 17])

+ 1 adds 1 to each index, which gives the first index of each following group:

In [12]: np.where(np.diff(x) > .5)[0] + 1
Out[12]: array([ 3,  7, 11, 14, 18])

Passing [ 3, 7, 11, 14, 18] to np.split then splits the elements into sub-arrays at those positions.
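Since the question's end goal is the average of each group, the split can be followed by a mean per sub-array. The following is a minimal Python 3 sketch, assuming the input is already sorted in ascending order. The helper name split_by_gap and its gap parameter are made up for this illustration and are not part of NumPy.

```python
import numpy as np


def split_by_gap(values, gap=0.5):
    """Split a sorted 1-D array wherever consecutive values differ by more than `gap`.

    Hypothetical helper for illustration; assumes `values` is sorted ascending.
    """
    values = np.asarray(values, dtype=float)
    # Index of the first element of each new group: one past each large jump.
    boundaries = np.where(np.diff(values) > gap)[0] + 1
    return np.split(values, boundaries)


x = np.array([-8.67735471, -8.63727455, -8.59719439, -5.5511022,
              -5.51102204, -5.47094188, -5.43086172, -2.4248497,
              -2.38476954, -2.34468938, -2.30460922, 0.74148297,
              0.78156313, 0.82164329, 3.86773547, 3.90781563,
              3.94789579, 3.98797595, 7.03406814, 7.0741483])

groups = split_by_gap(x)
# One average per group of close values.
means = [group.mean() for group in groups]

print(len(groups))
print(means)
```

If the input might not be sorted, calling np.sort on it first would be needed for the gap test to be meaningful.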

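Answer 1: (score: 0)

If your end goal is to find the average of each cluster/group, where neighbouring values within a cluster differ by no more than some threshold, you can use the approach listed below.

Basically, we convert the input list to a NumPy array, sort it, and then take the differences between consecutive elements. By comparing those differences against a threshold, we build an ID array in which elements of the same group share the same ID. Finally, using those IDs, we use np.bincount to bin and average the values, which gives the mean of each group.

Here's the implementation -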

import numpy as np

# Input list
AList = [-8.67735471, -8.63727455, -8.59719439, -5.5511022,  -5.51102204,
         -5.47094188, -5.43086172, -2.4248497,  -2.38476954, -2.34468938,
         -2.30460922,  0.74148297,  0.78156313,  0.82164329,  3.86773547,
          3.90781563,  3.94789579,  3.98797595,  7.03406814,  7.0741483,
          7.11422846]

# Tolerance as thresholding parameter to distinguish between two "groups"
tolerance = 1

# Convert to a numpy array and sort if not already sorted
A = np.sort(np.asarray(AList))

# ID array that has the same IDs for elements of the same group
ID_array = (np.append([False],np.diff(A)>tolerance)).cumsum()

# Finally get the average values for each group    
average_values = np.bincount(ID_array,A)/np.bincount(ID_array)

Sample run -

In [301]: A
Out[301]: 
array([-8.67735471, -8.63727455, -8.59719439, -5.5511022 , -5.51102204,
       -5.47094188, -5.43086172, -2.4248497 , -2.38476954, -2.34468938,
       -2.30460922,  0.74148297,  0.78156313,  0.82164329,  3.86773547,
        3.90781563,  3.94789579,  3.98797595,  7.03406814,  7.0741483 ,
        7.11422846])

In [302]: average_values
Out[302]: 
array([-8.63727455, -5.49098196, -2.36472946,  0.78156313,  3.92785571,
        7.0741483 ])
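
To see why the two np.bincount calls produce group means, it can help to look at the intermediate arrays on a smaller input. The sketch below uses a made-up six-element array, A_small, which is not from the question, with the same tolerance of 1. np.bincount called with weights sums the values that share each ID, and np.bincount without weights counts them.

```python
import numpy as np

# Made-up, already sorted sample data for illustration only.
A_small = np.array([1.0, 1.2, 1.4, 5.0, 5.2, 9.0])
tolerance = 1

# True wherever the jump from the previous element exceeds the tolerance;
# the leading False keeps the first element in group 0.
new_group = np.append([False], np.diff(A_small) > tolerance)

# The running count of jumps gives each element its group ID.
ids = new_group.cumsum()

group_sums = np.bincount(ids, weights=A_small)  # sum of values per ID
group_sizes = np.bincount(ids)                  # number of values per ID
group_means = group_sums / group_sizes

print(ids)          # [0 0 0 1 1 2]
print(group_means)
```

Because the IDs come from a cumulative sum, they are consecutive integers starting at 0, which is the form np.bincount expects.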