Question

The problem

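I have a group of people, and I want each person to have a 1:1 meeting with every other person in the group. A given person can only meet one other person at a time, so I want to do the following:

Find every possible pairing combination
Group the pairs into meeting "rounds", where each person can appear in a round only once, and where a round should hold as many pairs as possible so that all possible pairing combinations are covered in the fewest rounds.

To demonstrate the problem in terms of the desired input/output, let's say I have the following list: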

>>> people = ['Dave', 'Mary', 'Susan', 'John']

I want to produce the following output:

>>> for round in make_rounds(people):
...     print(round)
[('Dave', 'Mary'), ('Susan', 'John')]
[('Dave', 'Susan'), ('Mary', 'John')]
[('Dave', 'John'), ('Mary', 'Susan')]

If I had an odd number of people, I would expect this result:

>>> people = ['Dave', 'Mary', 'Susan']
>>> for round in make_rounds(people):
...     print(round)
[('Dave', 'Mary')]
[('Dave', 'Susan')]
[('Mary', 'Susan')]

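The crux of the problem is that I need my solution to be performant (within reason). I've written code that works, but as the size of people grows it becomes exponentially slower. I don't know enough about writing performant algorithms to tell whether my code is simply inefficient or whether I'm bound by the parameters of the problem.

What I've tried

The first step is easy: I can get all possible pairings using itertools.combinations: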

>>> from itertools import combinations
>>> people_pairs = set(combinations(people, 2))
>>> print(people_pairs)
{('Dave', 'Mary'), ('Dave', 'Susan'), ('Dave', 'John'), ('Mary', 'Susan'), ('Mary', 'John'), ('Susan', 'John')}

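To work out the rounds themselves, I build each round like this:

Create an empty round list
Iterate over a copy of the people_pairs set calculated using the combinations method above
For each person in the pair, check whether any existing pairing in the current round already contains that individual
If there is already a pair containing one of the individuals, skip that pairing for this round. If not, add the pair to the round and remove it from the people_pairs list.
Once all the people pairs have been iterated over, append the round to a master rounds list
Start again, since people_pairs now contains only the pairs that didn't make it into the first round

Eventually this produces the desired result and whittles down my people pairs until there are none left and all the rounds are calculated. I can already see that this requires a ridiculous number of iterations, but I don't know a better way of doing it.

Here is my code: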

from itertools import combinations

# test if person already exists in any pairing inside a round of pairs
def person_in_round(person, round):
    is_in_round = any(person in pair for pair in round)
    return is_in_round

def make_rounds(people):
    people_pairs = set(combinations(people, 2))
    # we will remove pairings from people_pairs whilst we build rounds, so loop as long as people_pairs is not empty
    while people_pairs:
        round = []
        # make a copy of the current state of people_pairs to iterate over safely
        for pair in set(people_pairs):
            if not person_in_round(pair[0], round) and not person_in_round(pair[1], round):
                round.append(pair)
                people_pairs.remove(pair)
        yield round

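Plotting the performance of this method for list sizes of 100-300 using https://mycurvefit.com suggests that calculating the rounds for a list of 1000 people would probably take around 100 minutes. Is there a more efficient way of doing this?

Note: I'm not actually trying to organise a meeting of 1000 people :) this is just a simple example that represents the matching/combinatorial problem I'm trying to solve.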

Answer 1

This is an implementation of the algorithm described in the Wikipedia article Round-robin tournament.

from itertools import cycle, islice, chain

def round_robin(iterable):
    items = list(iterable)
    if len(items) % 2 != 0:
        items.append(None)
    fixed = items[:1]
    cyclers = cycle(items[1:])
    rounds = len(items) - 1
    npairs = len(items) // 2
    return [
        list(zip(
            chain(fixed, islice(cyclers, npairs-1)),
            reversed(list(islice(cyclers, npairs)))
        ))
        for _ in range(rounds)
        for _ in [next(cyclers)]
    ]
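
One way to gain confidence in any schedule, including the list returned by round_robin above, is to verify it mechanically. The following is a minimal sketch, not part of the original answer: check_schedule is a hypothetical helper name, and it assumes that names are unique and that None is used only as the bye placeholder for odd-sized groups.

```python
from itertools import combinations

def check_schedule(people, rounds):
    """Return True if every pair of people meets exactly once and
    nobody appears twice within a single round."""
    seen = set()
    for rnd in rounds:
        busy = set()
        for a, b in rnd:
            if a is None or b is None:
                continue  # a bye for odd-sized groups, not a real meeting
            if a in busy or b in busy:
                return False  # someone is double-booked in this round
            busy.update((a, b))
            pair = frozenset((a, b))
            if pair in seen:
                return False  # this pair has already met
            seen.add(pair)
    expected = {frozenset(p) for p in combinations(people, 2)}
    return seen == expected

people = ['Dave', 'Mary', 'Susan', 'John']
schedule = [
    [('Dave', 'Mary'), ('Susan', 'John')],
    [('Dave', 'Susan'), ('Mary', 'John')],
    [('Dave', 'John'), ('Mary', 'Susan')],
]
print(check_schedule(people, schedule))  # True
```

The hand-written schedule here is the expected output from the question; passing round_robin(people) as the second argument should work the same way.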

Answer 2

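I generate only indices (because I'd have trouble coming up with 1000 names =), but for 1000 numbers the run time is about 4 seconds.

The main problem with all the other approaches is that they work with pairs, and there are a lot of pairs, so the run time keeps growing. My approach differs in that it works with people rather than pairs. I keep a dict() that maps each person to the list of other people they still have to meet, and each of these lists is at most N items long (not N^2, as with pairs). Hence the time savings.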

#!/usr/bin/env python

from itertools import combinations
from collections import defaultdict

pairs = combinations( range(6), 2 )

pdict = defaultdict(list)
for p in pairs :
    pdict[p[0]].append( p[1] )

while len(pdict) :
    busy = set()
    print('-----')
    for p0 in pdict :
        if p0 in busy : continue

        for p1 in pdict[p0] :
            if p1 in busy : continue

            pdict[p0].remove( p1 )
            busy.add(p0)
            busy.add(p1)
            print((p0, p1))

            break

    # remove empty entries
    pdict = { k : v for k,v in pdict.items() if len(v) > 0 }

'''
output:
-----
(0, 1)
(2, 3)
(4, 5)
-----
(0, 2)
(1, 3)
-----
(0, 3)
(1, 2)
-----
(0, 4)
(1, 5)
-----
(0, 5)
(1, 4)
-----
(2, 4)
(3, 5)
-----
(2, 5)
(3, 4)
'''
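
The same per-person idea can be wrapped in a function that returns the rounds instead of printing them, which makes the number of rounds easy to inspect. This is a sketch only: greedy_rounds and todo are illustrative names that do not appear in the answer. Note that the sample output above uses 7 rounds for 6 people, whereas a round-robin schedule needs only 5, so the greedy pass is fast but does not minimise the number of rounds.

```python
from itertools import combinations

def greedy_rounds(n):
    """Greedy per-person scheduling for people numbered 0..n-1."""
    # todo[p] lists the higher-numbered people that p still has to meet
    todo = {p: [] for p in range(n)}
    for a, b in combinations(range(n), 2):
        todo[a].append(b)
    todo = {p: rest for p, rest in todo.items() if rest}

    rounds = []
    while todo:
        busy = set()
        current = []
        for p0, partners in todo.items():
            if p0 in busy:
                continue
            for p1 in partners:
                if p1 not in busy:
                    partners.remove(p1)  # safe: we break right after mutating
                    busy.update((p0, p1))
                    current.append((p0, p1))
                    break
        rounds.append(current)
        # drop people who have no meetings left
        todo = {p: rest for p, rest in todo.items() if rest}
    return rounds

rounds = greedy_rounds(6)
print(len(rounds))  # 7, versus the minimum of 5 for six people
```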

Answer 3

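There are two things you can do right off the bat:

Don't make a copy of the set on every pass. That wastes a lot of time and memory. Instead, modify the set once after each iteration.
Keep a separate set of the people in each round. Looking up a person in a set is an order of magnitude faster than looping through the whole round.

For example:

from itertools import combinations

def make_rounds(people):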
    people_pairs = set(combinations(people, 2))

    while people_pairs:
        round = set()
        people_covered = set()
        for pair in people_pairs:
            if pair[0] not in people_covered \
               and pair[1] not in people_covered:
                round.add(pair)
                people_covered.update(pair)
        people_pairs -= round  # drop the pairs scheduled in this round
        yield round

Comparison: [timing chart not preserved]

Answer 4

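When you need fast lookups, hashes/dicts are the way to go. Keep track of who is already in each round in a dict rather than a list and it will be much faster.

Since you're getting into algorithms, studying big O notation will help you out, and knowing which data structures are good at which kinds of operations is also key. See this guide for the time complexity of Python built-ins: https://wiki.python.org/moin/TimeComplexity. You'll see that checking for an item in a list is O(n), meaning it scales linearly with the size of your input. Since that check sits inside a loop, you end up with O(n^2) or worse. For dictionaries, lookups are generally O(1), meaning the size of the input doesn't matter.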
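
The difference between the two membership tests can be measured directly. The snippet below is an illustrative sketch rather than a rigorous benchmark: the container size and repeat count are arbitrary choices, and the absolute timings will vary by machine.

```python
import timeit

n = 10_000
as_list = list(range(n))
as_set = set(as_list)
missing = -1  # worst case for the list: every element is inspected

# `in` on a list scans element by element; `in` on a set is a hash lookup
list_time = timeit.timeit(lambda: missing in as_list, number=1_000)
set_time = timeit.timeit(lambda: missing in as_set, number=1_000)
print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
```

The list timing should grow roughly in proportion to n, while the set timing should stay about the same.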

Also, don't overwrite built-in functions. I've changed round to round_.

from itertools import combinations

# test if person already exists in any pairing inside a round of pairs
def person_in_round(person, people_dict):
    return people_dict.get(person, False)

def make_rounds(people):
    people_pairs = set(combinations(people, 2))
    # we will remove pairings from people_pairs whilst we build rounds, so loop as long as people_pairs is not empty
    while people_pairs:
        round_ = []
        people_dict = {}
        # make a copy of the current state of people_pairs to iterate over safely
        for pair in set(people_pairs):
            if not person_in_round(pair[0], people_dict) and not person_in_round(pair[1], people_dict):
                round_.append(pair)
                people_dict[pair[0]] = True
                people_dict[pair[1]] = True
                people_pairs.remove(pair)
        yield round_

Answer 5

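Maybe I'm missing something (not entirely uncommon), but this sounds like a plain old round-robin tournament, where each team plays every other team exactly once.

There are O(n^2) methods for handling this "by hand" that work just fine "by machine". A good description can be found in the Wikipedia article on Round-Robin Tournaments.

About that O(n^2): there will be n-1 or n rounds, each requiring O(n) steps to rotate all but one table entry and O(n) steps to enumerate the matches in each round. You can make the rotation O(1) by using a doubly-linked list, but enumerating the matches is still O(n). So O(n) * O(n) = O(n^2).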
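
The rotation described above can be sketched with collections.deque, which is implemented as a doubly-linked structure and rotates by one step in constant time. This is one possible reading of the circle method, not code from the answer; rotate_rounds is a hypothetical name, and None is assumed to be free for use as the placeholder partner when the group size is odd.

```python
from collections import deque

def rotate_rounds(people):
    """Circle method: fix one entry, rotate the rest once per round."""
    items = list(people)
    if len(items) < 2:
        return  # nobody to pair up
    if len(items) % 2:
        items.append(None)  # placeholder partner; whoever draws it sits out
    fixed, ring = items[0], deque(items[1:])
    half = len(items) // 2
    for _ in range(len(items) - 1):
        table = [fixed, *ring]
        # Pair position i with position -1 - i: O(n) work per round
        pairs = [(table[i], table[-1 - i]) for i in range(half)]
        yield [p for p in pairs if None not in p]
        ring.rotate(1)  # O(1) single-step rotation

for rnd in rotate_rounds(['Dave', 'Mary', 'Susan', 'John']):
    print(rnd)
# [('Dave', 'John'), ('Mary', 'Susan')]
# [('Dave', 'Susan'), ('John', 'Mary')]
# [('Dave', 'Mary'), ('Susan', 'John')]
```

With n people this yields n-1 rounds for even n and n rounds for odd n, each built in O(n) steps, which matches the O(n^2) total discussed above.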

Answer 6

This takes about 45s on my computer:

from itertools import combinations

def make_rnds(people):
    people_pairs = set(combinations(people, 2))
    # we will remove pairings from people_pairs whilst we build rnds, so loop as long as people_pairs is not empty
    while people_pairs:
        rnd = []
        rnd_set = set()
        peeps = set(people)
        # make a copy of the current state of people_pairs to iterate over safely
        for pair in set(people_pairs):
            if pair[0] not in rnd_set and pair[1] not in rnd_set:
                rnd_set.update(pair)
                rnd.append(pair)

                peeps.remove(pair[0])
                peeps.remove(pair[1])

                people_pairs.remove(pair)
                if not peeps:
                    break
        yield rnd

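I removed the function person_in_rnd to reduce the time wasted on function calls, and added two variables called rnd_set and peeps. rnd_set is a set of everyone in the round so far and is used to check each pair against. peeps is a copied set of people; every time we add a pair to rnd, we remove those individuals from peeps. This lets us stop iterating through all the combinations once peeps is empty, that is, once everyone is in the round.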
