Efficiently finding the overlap of lists of tuples

Asked: 2014-09-10 21:12:32

Tags: python performance list tuples

I have a bunch of lists, each of which is made up of tuples.

A = [(1,2,3),(4,5,7),(8,9,10),(5,6,2)]
B = [(1,3,6),(4,2,8),(3,6,7),(5,2,8)]
C = [(6,2,3),(1,7,2),(5,7,2),(7,2,7)]

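I need to find the set of tuples whose first element appears as a first element in every list. (I know this is confusing.) For my example above, the overlap would be: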

overlap = [(1,2,3),(1,3,6),(1,7,2),(5,6,2),(5,2,8),(5,7,2)]

This is because a tuple whose first element is 1 appears in every list. The same is true for the number 5.

What is the best way to do this?

I currently have working code, but I feel there is a better way to do it.

big_list = [A,B,C]
overlap = []
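# Flatten the lists into one collection of unique tuples
# (the lists themselves are unhashable, so iterate over their items)
all_points = list(set(item for lst in big_list for item in lst))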
for (f,s,t) in all_points:
    in_all = True
    for lest in big_list:
        present = False
        for (first, second, third) in lest:
            if first == f:
                present = True
        if not present:
            in_all = False
    if in_all:
        overlap.append((f,s,t))

3 Answers:

Answer 0 (score: 2)

You can use set intersection:

>>> from itertools import chain
>>> def get_first(seq):
...     return (x[0] for x in seq)
...
>>> common = set(get_first(A)).intersection(get_first(B), get_first(C))

Now common contains:

>>> common
set([1, 5])

Now we can iterate over the items of A, B and C and pick the tuples whose first item is found in common:

>>> [x for x in chain(A, B, C) if x[0] in common]
[(1, 2, 3), (5, 6, 2), (1, 3, 6), (5, 2, 8), (1, 7, 2), (5, 7, 2)]

Sorted by first item (sorted is stable, so tuples with the same first item keep their A, B, C order):

>>> from operator import itemgetter
>>> sorted((x for x in chain(A, B, C) if x[0] in common), key=itemgetter(0))
[(1, 2, 3), (1, 3, 6), (1, 7, 2), (5, 6, 2), (5, 2, 8), (5, 7, 2)]
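The same idea extends to any number of lists. A minimal sketch, assuming a hypothetical helper named overlap_by_first (not part of the original answer) that takes the lists as positional arguments:

```python
from itertools import chain
from operator import itemgetter


def overlap_by_first(*lists):
    """Return the tuples whose first element occurs as a first element in every list.

    Hypothetical helper for illustration; the name is an assumption.
    """
    if not lists:
        return []
    # One set of first elements per input list
    first_sets = [set(t[0] for t in lst) for lst in lists]
    # First elements shared by all lists
    common = set.intersection(*first_sets)
    # Keep matching tuples, grouped by first element (stable sort)
    return sorted((t for t in chain(*lists) if t[0] in common), key=itemgetter(0))


A = [(1, 2, 3), (4, 5, 7), (8, 9, 10), (5, 6, 2)]
B = [(1, 3, 6), (4, 2, 8), (3, 6, 7), (5, 2, 8)]
C = [(6, 2, 3), (1, 7, 2), (5, 7, 2), (7, 2, 7)]

overlap = overlap_by_first(A, B, C)
```

Building the sets costs time proportional to the total number of tuples, and each membership test against common is a constant-time lookup on average, so this avoids the nested loops in the question.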

Answer 1 (score: 1)

Since you only care about the first element, you don't need to run multiple loops. Use a set instead. Comments are provided inline.

A = [(1,2,3),(4,5,7),(8,9,10),(5,6,2)]
B = [(1,3,6),(4,2,8),(3,6,7),(5,2,8)] 
C = [(6,2,3),(1,7,2),(5,7,2),(7,2,7)]

def tuple_F(n_list):
    # You only care about unique first elements, so using a set will be more efficient.
    return set(n_l[0] for n_l in n_list)

set_FA = tuple_F(A)
set_FB = tuple_F(B)
set_FC = tuple_F(C)

# Python makes it ridiculously easy to take intersection of multiple sets at one shot
set_ABC = set.intersection(set_FA, set_FB, set_FC)

# And again, python makes it really easy to concatenate the lists using just A+B+C
overlap = [tup for tup in A+B+C if tup[0] in set_ABC]

print(overlap)

This prints:

[(1, 2, 3), (5, 6, 2), (1, 3, 6), (5, 2, 8), (1, 7, 2), (5, 7, 2)]

Hope this helps!

Answer 2 (score: 0)

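Since you have a list of lists, I would use one loop combined with a single level of list comprehension. A two-level list comprehension would be hard to read.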

first_letters = None
for l in [A, B, C]:
    f = set(i[0] for i in l)
    # Compare with None: an empty intersection is falsy but must not be reset
    first_letters = first_letters & f if first_letters is not None else f
overlap = [i for i in A + B + C if i[0] in first_letters]

Or, if you don't like explicit loops:

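from functools import reduce  # built in on Python 2; needs this import on Python 3

def func1(x, y):
    f = set(i[0] for i in y)
    # Compare with None: an empty intersection is falsy but must not be reset
    return x & f if x is not None else f

first_letters = reduce(func1, [A, B, C], None)
overlap = [i for i in A + B + C if i[0] in first_letters]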

If you know all the possible values, the code can be simplified further:

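from functools import reduce  # built in on Python 2; needs this import on Python 3

all_possible = set(range(9))  # must cover every possible first element
first_letters = reduce(lambda x, y: x & set(i[0] for i in y), [A, B, C], all_possible)
overlap = [i for i in A + B + C if i[0] in first_letters]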
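One edge case for the accumulator approach: when two lists share no first element, the running intersection becomes an empty set, which is falsy. A truthiness test on the accumulator would then restart from the next list's first elements, so comparing against None is safer. A minimal sketch, where common_firsts is a hypothetical name used only for illustration:

```python
from functools import reduce


def common_firsts(lists):
    """Intersect the first elements of every list; hypothetical helper for illustration."""
    def step(acc, lst):
        firsts = set(t[0] for t in lst)
        # None marks "no list seen yet"; an empty set is a valid running result
        return firsts if acc is None else acc & firsts

    result = reduce(step, lists, None)
    return set() if result is None else result


A = [(1, 2, 3), (4, 5, 7), (8, 9, 10), (5, 6, 2)]
B = [(1, 3, 6), (4, 2, 8), (3, 6, 7), (5, 2, 8)]
C = [(6, 2, 3), (1, 7, 2), (5, 7, 2), (7, 2, 7)]

shared = common_firsts([A, B, C])
# The first two lists are disjoint, so the result must stay empty
# even though the third list has first elements of its own.
disjoint = common_firsts([[(1, 2)], [(2, 3)], [(2, 4)]])
```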