解释python模块itertools的组合功能

时间:2014-07-23 10:23:24

标签: python combinations itertools

我经常在Python中使用itertools模块,但如果我不知道它背后的逻辑,那就像是作弊。

以下是在订单不重要时查找字符串组合的代码。

def combinations(iterable, r):
    # combinations('ABCD', 2) --> AB AC AD BC BD CD
    # combinations(range(4), 3) --> 012 013 023 123
    pool = tuple(iterable)
    n = len(pool)
    if r > n:
        return
    indices = list(range(r))
    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != i + n - r:
                break
        else:
            return
        indices[i] += 1
        for j in range(i+1, r):
            indices[j] = indices[j-1] + 1
        yield tuple(pool[i] for i in indices)

有人可以解释基本想法吗?特别是在第14行

2 个答案:

答案 0 :(得分:5)

def combinations(iterable, r):
    # combinations('ABCD', 2) --> AB AC AD BC BD CD
    # combinations(range(4), 3) --> 012 013 023 123
    pool = tuple(iterable)
    # first you create a tuple of the original input which you can refer later with 
    # the corresponding indices
    n = len(pool)
    # get the length of the tuple
    if r > n:
        return
    # if the length of the desired permutation is higher than the length of the tuple 
    # it is not possible to create permutations so return without doing something

    indices = list(range(r))
    # create the first list of indices in normal order ( indices = [0,1,2,3,...,r])
    # up to the desired range r

    yield tuple(pool[i] for i in indices)
    # return the first permutation which is a tuple of the input with the original 
    # indices up to r tuple(tuple[0], tuple[1],....,tuple[r])

    while True:
        for i in reversed(range(r)):
            # i will go from r-1, r-2, r-3, ....,0

            if indices[i] != i + n - r:
                # if condition is true except for the case 
                # that at the position i in the tuple the last possible 
                # character appears then it is equal and proceed with the character 
                # before which means that this character is replaced by the next 
                # possible one

                # example: tuple='ABCDE' so n = 5, r=3 indices is [0,1,2] at start i=2
                # yield (A,B,C)
                # indices[i] is 2 and checks if 2 != 4 (2 +5-3) is true and break
                # increase indices[i]+1 and yield (A,B,D)
                # indices[i] is 3 and checks if 3 != 4 (2 +5-3) is true and break
                # increase indices[i]+1 and yield (A,B,E) 
                # indices[i] is 4 and checks if 4 != 4 (2 +5-3) is false so next loop 
                # iteration:  i = 1 indices[i] is 1 and checks if 4 != 3 (1 +5-3) 
                # is true and break .... and so on

                break
        else:
            # when the forloop completely finished then all possible character 
            # combinations are processed and the function ends
            return

        indices[i] += 1 # as written proceed with the next character which means the 
                        # index at i is increased
        for j in range(i+1, r): 
            indices[j] = indices[j-1] + 1 # all the following indexes are increased as 
                                          # well since we only want to at following 
                                          # characters and not at previous one or the
                                          # same which is index at indice[i]
        yield tuple(pool[i] for i in indices)
        # return the new tuple

答案 1 :(得分:1)

def combinations(iterable, r):
    # first, we need to understand, this function is to record every possibility of indices
    # then return the elements with the indices

    pool = tuple(iterable)

    n = len(pool)

    if r > n:
        return
    indices = list(range(r))

    # yield the first permutation, 
    # cause in the "while" circle, we will start to change the indices by plus 1 consistently
    # for example: iterable is [1, 2, 3, 4, 5], and r = 3
    # this yield will return [1, 2, 3], but in the "while" loop, 
    # we will start to update last elements' index to 4, which will return [1, 2, 4]
    yield tuple(pool[i] for i in indices)

    while True:

        # in this for loop, we want to confirm whether indices[i] can be increased or not
        for i in reversed(range(r)):

            # after reversed, i will be r-1, r-2, r-3, ....,0
            # something we should know before we start the 'for' loop
            # the value of indices[r-1] should not greater than n-1
            # the value of indices[r-2] should not greater than n-2
            # and the maximum of indices[i] should be indices[r-1]
            # so the value of indices[r-1] should between r-1 and n-r + r-1, like this:
            #       r-1 <= indics[r-1] <= n-r + r-1
            # so, to r-2:
            #       r-2 <= indics[r-1] <= n-r + r-2
            # let's set i = r-1:
            #       i <= indices[i] <= n-r+i (n-1 is the maximum value)
            # since we will keep plusing the value of indices[i], let's ignore i <= indices[i]
            # and we just want to know if indices[i] can plus or not,
            # so indices[i] can be equal with n-r+i
            # then we got:
            #       indices[i] < i + n - r
            # the offical document give: indices[i] != i + n - r,
            # cause when indices[i] == i + n - r, it arrived the boundry, 
            # the "for" loop will get into indices[i-1], there will be no judgement for ">i+n-r"
            # so to use indices[i] != i + n - r is still a good way, 
            # but i prefer indices[i] < i + n - r, which is easier to understand for me.
            # so next question is "break" in here, 
            # it means the value of indices[i] doesn't reach the boundry still can plus one,
            # let break out to continue the iteration
            # when it hit the boundry, i will be r-2
            # So we can see the result:
            # 1, 2, 3
            # 1, 2, 4
            # 1, 2, 5
            # 1, 3, 4
            # always loop the last index, hit the boundry, check the last but one.
            if indices[i] < i + n - r:
                break
        else:
            # loop finished, return
            return

        # first of all, plus one for current indices[i], 
        # that's why we yield the first permutation, before the while loop
        # and increase every indices[i] by 1
        indices[i] = indices[i] + 1
        # this for loop is increase every indices which is after indices[i].
        # cause, current index increased, and we need to confirm every one behind is orderd
        # for example: current we got i = 2, indices[i]+1 will be 3, 
        # so the next loop will start with [1, 3, 4], not [1, 3, 3]
        for j in range(i+1, r):
            indices[j] = indices[j-1] + 1

        yield tuple(pool[i] for i in indices)