Searching for repeated values in a 2D array

Asked: 2013-04-29 01:28:23

Tags: python arrays search area

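I am looking for a way to search a 2D array for repeated sections.

Take the following array as an example (the repeated values are marked with asterisks):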

 1    2    3    4    5
 6    7    8    9   10
11   12   13   14   15
16   17   18   19   20
21   22   23   24   25
26    *8    9*   29   30
31   *13   14   15*   35
17   *18   19*   39   40
41   *23   24*   44   45
46   47   48   49   50

Is there any way to search for the repeated regions automatically and save their coordinates?

4 answers:

Answer 0 (score: 1)

>>> l=[[1,    2,    3,    4,    5],
... [6,    7,    8,    9,   10],
... [11,   12,   13,   14,   15],
... [16,   17,   18,   19,   20],
... [21,   22,   23,   24,   25],
... [26,    8,    9,   29,   30],
... [31,   13,   14,   15,   35],
... [17,   18,   19,   39,   40],
... [41,   23,   24,   44,   45],
... [46,   47,   48,   49,   50]]
>>> seen = set()
>>> dupes = {}
>>> for i_index, i in enumerate(l):
...     for j_index, j in enumerate(i):
...         if j in seen:
...             dupes[(i_index, j_index)] = j
...         seen.add(j)
...
>>> for coord, num in dupes.items():
...     print("%s: %s" % (coord, num))
...
(5, 1): 8
(5, 2): 9
(6, 1): 13
(6, 2): 14
(6, 3): 15
(7, 0): 17
(7, 1): 18
(7, 2): 19
(8, 1): 23
(8, 2): 24

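Answer 1 (score: 0)

Keep a collections.Counter of all entries seen so far. While iterating over the array, check whether each element is already in the counter. If it is, append its coordinates to a list and move on. If it is not, increment the counter for that number.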
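
A minimal sketch of this idea, assuming the array is a list of lists. The function name find_repeats and the small sample grid are illustrative and not part of the original answer:

```python
from collections import Counter

def find_repeats(grid):
    """Return (row, col, value) for every value that was seen earlier."""
    counts = Counter()   # values encountered so far
    repeats = []         # coordinates of each repeated occurrence
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if counts[value]:        # seen before: record where it repeats
                repeats.append((r, c, value))
            else:                    # first sighting: start counting it
                counts[value] += 1
    return repeats

grid = [[1, 2, 3],
        [4, 2, 5],
        [3, 6, 2]]
print(find_repeats(grid))
```

Because the count is only ever tested for zero or non-zero here, a plain set would do the same job. The Counter becomes useful if you also increment it on repeats to learn how often each value occurs.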

Answer 2 (score: 0)

Use a dict whose keys are the numbers and whose values are lists of the coordinates where each number occurs.

In [171]: lis
Out[171]: 
[[1, 2, 3, 4, 5],
 [6, 7, 8, 9, 10],
 [11, 12, 13, 14, 15],
 [16, 17, 18, 19, 20],
 [21, 22, 23, 24, 25],
 [26, 8, 9, 29, 30],
 [31, 13, 14, 15, 35],
 [17, 18, 19, 39, 40],
 [41, 23, 24, 44, 45],
 [46, 47, 48, 49, 50]]

In [172]: from collections import defaultdict

In [173]: dic=defaultdict(list)

In [174]: for i,x in enumerate(lis):
    for j,y in enumerate(x):
        dic[y].append((i,j))
   .....:         

In [175]: for num,coords in dic.items():
    if len(coords)>1:
        print("{0} was repeated at coordinates {1}".format(num,
              " ".join(str(x) for x in coords)))
   .....:         
8 was repeated at coordinates (1, 2) (5, 1)
9 was repeated at coordinates (1, 3) (5, 2)
13 was repeated at coordinates (2, 2) (6, 1)
14 was repeated at coordinates (2, 3) (6, 2)
15 was repeated at coordinates (2, 4) (6, 3)
17 was repeated at coordinates (3, 1) (7, 0)
18 was repeated at coordinates (3, 2) (7, 1)
19 was repeated at coordinates (3, 3) (7, 2)
23 was repeated at coordinates (4, 2) (8, 1)
24 was repeated at coordinates (4, 3) (8, 2)

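Answer 3 (score: 0)

If I understand your question correctly, you want to find not only single repeated values but any repeated series of values. For example, [1,2,3,4] and [39,87,2,3,4] share the series [2,3,4], which should be reported as a duplicate.

Imports and test values: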

import itertools,pprint
from collections import defaultdict
l = ((1, 2, 3, 4, 5),
 (6, 7, 8, 9, 10),
 (11, 12, 13, 14, 15),
 (16, 17, 18, 19, 20),
 (21, 22, 23, 24, 25),
 (26, 8, 9, 29, 30),
 (31, 13, 14, 15, 35),
 (17, 18, 19, 39, 40),
 (41, 23, 24, 44, 45),
 (46, 47, 48, 49, 50))

Main code:

seen = defaultdict(dict)
for y, row in enumerate(l):
    rowlen = len(row)
    # values[e] holds every contiguous slice of length e+1 in this row
    values = [[row[i:k+1] for (i, k) in zip(range(rowlen), range(e, rowlen))]
              for e in range(rowlen)]
    for valueGroup in values:
        for x, value in enumerate(valueGroup):
            # count how often this slice occurs anywhere in the array
            seen[value]['count'] = seen[value].get('count', 0) + 1
            # remember where the slice starts in this row; storing x per row
            # keeps a later row from overwriting an earlier row's position
            seen[("R", y)][value] = x

# report, row by row, every slice that occurs more than once
for y in range(len(l)):
    for value, x in seen[("R", y)].items():
        if seen[value]['count'] > 1:
            print("{0} repeated at ({1},{2})".format(value, x, y))

This prints, as a sample (there is more output, and the line order may vary):

(13, 14) repeated at (1,6)
(14, 15) repeated at (2,6)
(13,) repeated at (1,6)
(13, 14, 15) repeated at (1,6)
(14,) repeated at (2,6)
(17, 18) repeated at (0,7)
(18, 19) repeated at (1,7)
(17,) repeated at (0,7)
(18,) repeated at (1,7)
(19,) repeated at (2,7)
(17, 18, 19) repeated at (0,7)
(23,) repeated at (1,8)
(24,) repeated at (2,8)
(23, 24) repeated at (1,8)

The list comprehension logic was worked out from this example:

 l = [1,2,3,4]
 len = 4
 i:k+1
 0:1 1:2 2:3 3:4  i = 0,1,2,len-e-1  k = e,e+1,e+2,e+3    e = 0
 0:2 1:3 2:4      i = 0,1,len-e-1    k = e,e+1,e+2        e = 1
 0:3 1:4          i = 0,len-e-1      k = e,e+1            e = 2
 0:4              i = len-e-1        k = e                e = 3
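
The slicing pattern tabulated above can be checked directly. The sketch below isolates the list comprehension in a helper (the name contiguous_slices is hypothetical, introduced only for illustration) and prints the slices it produces for the four-element example:

```python
def contiguous_slices(row):
    """Group every contiguous slice of row by length (1, 2, ..., len(row))."""
    n = len(row)
    # For offset e, i runs over 0..n-e-1 and k over e..n-1, so each
    # slice row[i:k+1] has exactly e+1 elements.
    return [[row[i:k + 1] for i, k in zip(range(n), range(e, n))]
            for e in range(n)]

for group in contiguous_slices((1, 2, 3, 4)):
    print(group)
# [(1,), (2,), (3,), (4,)]
# [(1, 2), (2, 3), (3, 4)]
# [(1, 2, 3), (2, 3, 4)]
# [(1, 2, 3, 4)]
```

The row is passed as a tuple so that each slice is itself a tuple and can be used as a dict key, which is what the main code relies on.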

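This approach differs from the other answers in that it checks both individual numbers and sequences of numbers, and reports every value involved in a match.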