PriorityQueue is very slow

Asked: 2013-06-26 19:09:15

Tags: python performance python-2.7 queue priority-queue

I am implementing a mathematical function and need a priority queue. I am using the code from this page:

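from Queue import PriorityQueue  # Python 2.7; the module is named queue in Python 3


class MyPriorityQueue(PriorityQueue):

    def __init__(self):
        PriorityQueue.__init__(self)
        # Number of queued items; also used as the tie-breaker in put().
        self.counter = 0

    def put(self, item, priority):
        # Entries are ordered by priority first, then by the counter value.
        PriorityQueue.put(self, (priority, self.counter, item))
        self.counter += 1

    def get(self, *args, **kwargs):
        # Return None instead of blocking when the queue is empty.
        if self.counter == 0:
            return None

        _, _, item = PriorityQueue.get(self, *args, **kwargs)
        self.counter -= 1
        return item

    def empty(self):
        return self.counter == 0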

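Python is known to be slow, but the profile below shows that dequeuing alone takes 28% of the total execution time. Does anyone have any suggestions?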

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    34                                               @profile
    35                                               def solution(self):
    36                                           
    37         1           11     11.0      0.0          root = Node()
    38         1            2      2.0      0.0          root.capacity = self.K - root.size
    39         1           65     65.0      0.0          root.estimated = self.bound(root.level, root.size, root.value)
    40         1            4      4.0      0.0          root.copyList(None)
    41         1           37     37.0      0.0          self.queue.put(root, -0)
    42                                           
    43     99439       389936      3.9      2.3          while not self.queue.empty():
    44                                           
    45     99438      4666742     46.9     28.0              node = self.queue.get()
    46                                           
    47     99438       272335      2.7      1.6              if node.estimated > self.maxValue:
    48                                           

Update

Switching to heapq cut the dequeue time almost in half (from 46.9 to 25.0 per hit):

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    67                                               @profile
    68                                               def solution(self):
    69                                           
    70         1           13     13.0      0.0          root = Node(0, 0, 0)
    71         1            2      2.0      0.0          root.capacity = self.K - root.size
    72         1           70     70.0      0.0          root.estimated = self.bound(root.level, root.size, root.value)
    73         1            5      5.0      0.0          root.copyList(None)
    74         1            5      5.0      0.0          heappush(self.queue, (-0, root))
    75                                           
    76     99439       171957      1.7      1.5          while self.queue:
    77                                           
    78     99438      2488221     25.0     21.7              node = heappop(self.queue)[1]
    79                                           
    80     99438       227451      2.3      2.0              if node.estimated > self.maxValue:

Is there a way to optimize this loop?

while k < self.numItems:
    estimated += self.items[k].value
    totalSize += self.items[k].weight
    k += 1

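2 answers:

Answer 0 (score: 4)

You can use the heapq module.

As long as you are not using multiple threads, it does what you want and is likely faster than other priority queues: Queue.PriorityQueue is itself built on heapq and additionally acquires a lock on every put and get.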

heap = []            # creates an empty heap
heappush(heap, item) # pushes a new item on the heap
item = heappop(heap) # pops the smallest item from the heap
item = heap[0]       # smallest item on the heap without popping it
heapify(x)           # transforms list into a heap, in-place, in linear time

Here is an example:

>>> from heapq import *
>>> l = []
>>> heappush(l, (4, 'element')) # priority, element
>>> l
[(4, 'element')]
>>> heappush(l, (3, 'element2'))
>>> l
[(3, 'element2'), (4, 'element')]
>>> heappush(l, (5, 'element3'))
>>> l
[(3, 'element2'), (4, 'element'), (5, 'element3')]
>>> heappop(l)
(3, 'element2')
>>> heappop(l)
(4, 'element')
>>> heappop(l)
(5, 'element3')

len(l) gives the number of elements in the heap.

When l holds only tuples of integers, the loop you mentioned can be written like this:
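As a rough sketch of how the question's MyPriorityQueue could be rebuilt on heapq, the class below keeps the put(item, priority) and get() calls from the question. The name HeapPriorityQueue and the use of itertools.count as a tie-breaker are assumptions for illustration, not part of the original code, and the class is not thread-safe.

```python
import heapq
import itertools


class HeapPriorityQueue(object):
    """Single-threaded priority queue on top of heapq (hypothetical name)."""

    def __init__(self):
        self._heap = []
        # Monotonically increasing tie-breaker; it is never decremented.
        self._counter = itertools.count()

    def put(self, item, priority):
        # The counter keeps equal-priority entries in insertion order and
        # means heapq never has to compare the items themselves.
        heapq.heappush(self._heap, (priority, next(self._counter), item))

    def get(self):
        # Return None when empty, mirroring the question's get().
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


q = HeapPriorityQueue()
q.put("low", 5)
q.put("high", 1)
q.put("also high", 1)
order = [q.get(), q.get(), q.get()]  # smallest priority first, ties in insertion order
```

Because the list itself is the heap, `while len(q):` or a direct truth test on the underlying list replaces the empty() method.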

l = [(3, 1000), (4, 2000), (5, 500)]
estimated = sum(t[1] for t in l)
totalSize = sum(t[0] for t in l)

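Alternative

If you have a small number of distinct priorities and a large number of elements, buckets work well: a dict of the form {priority: [queue]}, with one queue per priority.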
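A minimal sketch of the bucket idea, assuming smaller numbers mean higher priority (as with heapq) and first-in, first-out order within a bucket. The class name BucketQueue and its methods are hypothetical.

```python
from collections import deque


class BucketQueue(object):
    """Bucket-based priority queue (hypothetical name); smallest priority first."""

    def __init__(self):
        self._buckets = {}  # priority -> deque of items
        self._size = 0

    def put(self, item, priority):
        self._buckets.setdefault(priority, deque()).append(item)
        self._size += 1

    def get(self):
        if self._size == 0:
            return None
        # min() scans every distinct priority, so this is only cheap
        # when the number of distinct priorities is small.
        priority = min(self._buckets)
        bucket = self._buckets[priority]
        item = bucket.popleft()
        if not bucket:
            # Drop empty buckets so min() only sees live priorities.
            del self._buckets[priority]
        self._size -= 1
        return item

    def __len__(self):
        return self._size


bq = BucketQueue()
for name, prio in [("a", 2), ("b", 1), ("c", 2), ("d", 1)]:
    bq.put(name, prio)
drained = [bq.get() for _ in range(4)]
```

Whether this beats heapq depends on the workload; it is worth profiling both before committing to one.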

Answer 1 (score: 1)

while k < self.numItems:
    estimated += self.items[k].value
    totalSize += self.items[k].weight
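    k += 1

is equivalent to the following, provided k starts at 0 and both totals start at 0 (otherwise iterate over self.items[k:] and use += instead of =):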

estimated = sum(item.value for item in self.items)
totalSize = sum(item.weight for item in self.items)
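If the loop starts from an arbitrary k rather than 0, a slice keeps the two forms equivalent. The sketch below checks this on made-up data; the Item tuple and both function names are hypothetical stand-ins for the question's self.items.

```python
from collections import namedtuple

# Hypothetical stand-in for the objects stored in self.items.
Item = namedtuple("Item", ["value", "weight"])
items = [Item(60, 10), Item(100, 20), Item(120, 30)]


def tail_totals_loop(items, k):
    """Index-based loop from the question, starting at position k."""
    estimated = 0
    total_size = 0
    while k < len(items):
        estimated += items[k].value
        total_size += items[k].weight
        k += 1
    return estimated, total_size


def tail_totals_sum(items, k):
    """Same totals using sum() over the slice items[k:]."""
    tail = items[k:]
    return (sum(item.value for item in tail),
            sum(item.weight for item in tail))


from_one = tail_totals_sum(items, 1)
```

The slice copies the tail of the list, so for very long lists itertools.islice(items, k, None) avoids the copy.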