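Finding sets of disjoint sets from a list of tuples or sets in Python

Time: 2014-11-19 12:25:52

Tags: python set intersection

Here is the problem: I have a list of tuples (they could also be sets if needed). For example: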

a = [(1, 5), (4, 2), (4, 3), (5, 4), (6, 3), (7, 6)]

What I want to find is the list

r = [(1, 5, 4, 2, 3, 6, 7)]

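because every tuple shares an element with at least one other tuple, so once they are put together they all end up in a single group.

Another example: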

a = [(1, 5), (4, 2), (4, 3), (5, 4), (6, 3), (7, 6), (8, 9)]

The result should be

r = [(1, 5, 4, 2, 3, 6, 7), (8, 9)]

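I hope the problem is clear. So what is the most elegant way to do this in Python?

Cheers

3 answers:

Answer 0: (score: 4)

These are the connected components of a graph, which are easy to find with networkx. For your second example: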

>>> import networkx as nx
>>> edges = [(1, 5), (4, 2), (4, 3), (5, 4), (6, 3), (7, 6), (8, 9)]
>>> graph = nx.Graph(edges)
>>> [tuple(c) for c in nx.connected_components(graph)]
[(1, 2, 3, 4, 5, 6, 7), (8, 9)]
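
If installing networkx is not an option, the same connected-components idea can be sketched with the standard library alone. This is only an illustration, not part of the answer above: the function name connected_groups is made up, it assumes every item in the input is a 2-tuple, and it assumes the elements can be sorted.

```python
from collections import defaultdict, deque


def connected_groups(pairs):
    """Group the elements of `pairs` into connected components (plain BFS)."""
    # Build an undirected adjacency map: element -> set of neighbours.
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    groups = []
    for start in list(neighbours):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        # Sorting is only for readable output; it assumes orderable elements.
        groups.append(tuple(sorted(component)))
    return groups


edges = [(1, 5), (4, 2), (4, 3), (5, 4), (6, 3), (7, 6), (8, 9)]
print(connected_groups(edges))  # [(1, 2, 3, 4, 5, 6, 7), (8, 9)]
```

Each element is visited once and each pair is looked at twice, so the running time is linear in the size of the input, apart from the final sort.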

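Answer 1: (score: 1)

Take a look at this implementation. It uses a disjoint-set structure that stores the leader of every member and always merges the smaller group into the larger one, so looking up a leader is a single dictionary access and each element changes leader at most log(n) times. Note that it does not actually use path compression, and that add also snapshots the whole state for undo, which costs O(n) per call: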

class DisjointSet(object):

    def __init__(self,size=None):
        if size is None:
            self.leader = {}  # maps a member to the group's leader
            self.group = {}  # maps a group leader to the group (which is a set)
            self.oldgroup = {}
            self.oldleader = {}
        else:
            self.group = { i:set([i]) for i in range(0,size) }
            self.leader = { i:i for i in range(0,size) }
            self.oldgroup = { i:set([i]) for i in range(0,size) }
            self.oldleader = { i:i for i in range(0,size) }                

    def add(self, a, b):
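        # Snapshot the state for undo(). Each set is copied too, because the
        # in-place merges below would otherwise also change the snapshot.
        self.oldgroup = {k: set(v) for k, v in self.group.items()}
        self.oldleader = self.leader.copy()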
        leadera = self.leader.get(a)
        leaderb = self.leader.get(b)
        if leadera is not None:
            if leaderb is not None:
                if leadera == leaderb:
                    return  # nothing to do
                groupa = self.group[leadera]
                groupb = self.group[leaderb]
                if len(groupa) < len(groupb):
                    a, leadera, groupa, b, leaderb, groupb = b, leaderb, groupb, a, leadera, groupa
                groupa |= groupb
                del self.group[leaderb]
                for k in groupb:
                    self.leader[k] = leadera
            else:
                self.group[leadera].add(b)
                self.leader[b] = leadera
        else:
            if leaderb is not None:
                self.group[leaderb].add(a)
                self.leader[a] = leaderb
            else:
                self.leader[a] = self.leader[b] = a
                self.group[a] = set([a, b])

    def connected(self, a, b):
        leadera = self.leader.get(a)
        leaderb = self.leader.get(b)
        if leadera is not None:
            if leaderb is not None:
                return leadera == leaderb
            else:
                return False
        else:
            return False

    def undo(self):        
        self.group = self.oldgroup.copy()
        self.leader = self.oldleader.copy()


def test():
    x = DisjointSet()
    x.add(0,1)
    x.add(0,2)
    x.add(3,4)
    x.undo()
    print(x.leader)
    print(x.group)

if __name__ == "__main__":
    test()

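You can also undo the last add. In your case, you could do the following:

from DisjointSet import DisjointSet  # assumes the class above is saved as DisjointSet.py
a = [(1, 5), (4, 2), (4, 3), (5, 4), (6, 3), (7, 6)]
d = DisjointSet()
for e in a:
    d.add(*e)
print(d.group)
print(d.leader)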
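
For comparison, here is a minimal sketch of a disjoint-set forest that does use path compression, together with union by size. It is not the implementation from this answer: the class name UnionFind and its methods find, union and groups are hypothetical names chosen for the illustration, and undo is left out on purpose.

```python
class UnionFind:
    """Sketch of a disjoint-set forest with path compression and union by size."""

    def __init__(self):
        self.parent = {}  # element -> parent (a root points to itself)
        self.size = {}    # root -> number of elements in its tree

    def find(self, x):
        # Register unseen elements as singleton sets.
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1
            return x
        # First pass: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every visited node at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return  # already in the same set
        # Union by size: hang the smaller tree under the larger one.
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size.pop(rb)

    def groups(self):
        # Collect the members of each set, keyed by their root.
        out = {}
        for x in list(self.parent):
            out.setdefault(self.find(x), []).append(x)
        # Sorting is only for readable output; it assumes orderable elements.
        return [tuple(sorted(g)) for g in out.values()]


uf = UnionFind()
for a, b in [(1, 5), (4, 2), (4, 3), (5, 4), (6, 3), (7, 6), (8, 9)]:
    uf.union(a, b)
print(uf.groups())  # [(1, 2, 3, 4, 5, 6, 7), (8, 9)]
```

With both optimizations, a sequence of find and union calls runs in near-constant amortized time per operation. The trade-off is that the groups are no longer stored explicitly and have to be rebuilt by groups().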

Answer 2: (score: 0)

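def pairs_to_whole(touching_pairs: list):
    # Repeatedly take one pair as a seed and absorb every pair that overlaps it.
    out = []
    while len(touching_pairs) > 0:
        first, *rest = touching_pairs
        first = set(first)

        # Keep sweeping until a full pass adds nothing new to the seed set.
        lf = -1
        while len(first) > lf:
            lf = len(first)

            rest2 = []
            for r in rest:
                if len(first.intersection(set(r))) > 0:
                    first |= set(r)  # overlaps the seed: merge it in
                else:
                    rest2.append(r)  # no overlap yet: retry on the next sweep
            rest = rest2

        out.append(first)
        touching_pairs = rest  # whatever never overlapped starts the next group
    return out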