How to get Python threads to finish gracefully

Time: 2013-07-09 17:01:21

Tags: python multithreading python-2.7

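I'm working on a project that involves data collection and logging. I have two threads running, a collection thread and a logging thread, both started in main. I'm trying to let the program terminate gracefully on Ctrl-C.

I'm using a threading.Event to signal the threads to end their respective loops. This stops the sim_collectData method fine, but it doesn't seem to stop the logData thread properly. The "Collection terminated" print statement is never executed, and the program just stalls. (It doesn't end, it just sits there.)

The second while loop in logData is there to make sure everything in the queue gets logged. The goal is for Ctrl-C to stop the collection thread immediately, then let the logging thread finish emptying the queue, and only then fully terminate the program. (Right now the data is just printed; eventually it will be logged to a database.)

I don't understand why the second thread never terminates. I based what I've done on this answer: Stopping a thread after a certain amount of time. What am I missing?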

import Queue
import random
import threading
import time

def sim_collectData(input_queue, stop_event):
    ''' this provides some output simulating the serial
    data from the data logging hardware. 
    '''
    n = 0
    while not stop_event.is_set():
        input_queue.put("DATA: <here are some random data> " + str(n))
        stop_event.wait(random.randint(0,5))
        n += 1
    print "Terminating data collection..."
    return

def logData(input_queue, stop_event):
    n = 0

    # we *don't* want to loop based on queue size because the queue could
    # theoretically be empty while waiting on some data.
    while not stop_event.is_set():
        d = input_queue.get()
        if d.startswith("DATA:"):
            print d
        input_queue.task_done()
        n += 1

    # if the stop event is received and the previous loop terminates,
    # finish logging the rest of the items in the queue.
    print "Collection terminated. Logging remaining data to database..."
    while not input_queue.empty():
        d = input_queue.get()
        if d.startswith("DATA:"):
            print d
        input_queue.task_done()
        n += 1
    return


def main():
    input_queue = Queue.Queue()

    stop_event = threading.Event() # used to signal termination to the threads

    print "Starting data collection thread...",
    collection_thread = threading.Thread(target=sim_collectData, args=(input_queue, stop_event))
    collection_thread.start()
    print "Done."

    print "Starting logging thread...",
    logging_thread = threading.Thread(target=logData, args=(input_queue, stop_event))
    logging_thread.start()
    print "Done."

    try:
        while True:
            time.sleep(10)
    except (KeyboardInterrupt, SystemExit):
        # stop data collection. Let the logging thread finish logging everything in the queue
        stop_event.set()

main()

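4 Answers:

Answer 0 (score: 7)

The problem is that your logger is waiting in d = input_queue.get() and never checks the event. One solution is to skip the event entirely for the logger and invent a unique message that tells the logger to stop. When you get the signal, send that message to the queue.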

import threading
import Queue
import random
import time

def sim_collectData(input_queue, stop_event):
    ''' this provides some output simulating the serial
    data from the data logging hardware. 
    '''
    n = 0
    while not stop_event.is_set():
        input_queue.put("DATA: <here are some random data> " + str(n))
        stop_event.wait(random.randint(0,5))
        n += 1
    print "Terminating data collection..."
    input_queue.put(None)
    return

def logData(input_queue):
    n = 0

    # we *don't* want to loop based on queue size because the queue could
    # theoretically be empty while waiting on some data.
    while True:
        d = input_queue.get()
        if d is None:
            input_queue.task_done()
            return
        if d.startswith("DATA:"):
            print d
        input_queue.task_done()
        n += 1

def main():
    input_queue = Queue.Queue()

    stop_event = threading.Event() # used to signal termination to the threads

    print "Starting data collection thread...",
    collection_thread = threading.Thread(target=sim_collectData, args=(input_queue, stop_event))
    collection_thread.start()
    print "Done."

    print "Starting logging thread...",
    logging_thread = threading.Thread(target=logData, args=(input_queue,))
    logging_thread.start()
    print "Done."

    try:
        while True:
            time.sleep(10)
    except (KeyboardInterrupt, SystemExit):
        # stop data collection. Let the logging thread finish logging everything in the queue
        stop_event.set()

main()

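Answer 1 (score: 2)

I'm no threading expert, but in your logData function the first d = input_queue.get() is blocking, i.e. if the queue is empty it will wait forever until a queue message arrives. That is probably why the logData thread never terminates: it is waiting for a queue message.

See the [Python docs] for changing this to a non-blocking queue read: use .get(False) or .get_nowait(). Either one, however, needs some exception handling for the case where the queue is empty.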
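
As a rough illustration of the exception handling mentioned above, here is a minimal sketch. The helper name try_get and the sample item are made up for this example, and the import is written so that it should run on both Python 2.7 and Python 3.

```python
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2.7, as used in the question


def try_get(q):
    """Return (True, item) if an item was available, else (False, None)."""
    try:
        # get_nowait() is equivalent to get(False): it never blocks.
        return True, q.get_nowait()
    except queue.Empty:
        # Raised immediately when the queue has nothing to hand out.
        return False, None


q = queue.Queue()
q.put("DATA: sample")
first = try_get(q)    # an item is waiting
second = try_get(q)   # the queue is now empty
```

Because the call returns immediately either way, the surrounding loop gets a chance to check the stop event instead of waiting inside get() forever.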

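Answer 2 (score: 0)

You are calling a blocking get on input_queue with no timeout. In either part of logData, if you call input_queue.get() while the queue is empty, it blocks indefinitely and prevents logging_thread from finishing.

To fix this, call input_queue.get_nowait() or pass a timeout to input_queue.get().

Here is my suggestion: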

def logData(input_queue, stop_event):
    n = 0

    while not stop_event.is_set():
        try:
            d = input_queue.get_nowait()
            if d.startswith("DATA:"):
                print "LOG: " + d
                n += 1
        except Queue.Empty:
            time.sleep(1)
    return

You are also signaling the threads to terminate but not waiting for them to do so. Consider doing that in your main function.

try:
    while True:
        time.sleep(10)
except (KeyboardInterrupt, SystemExit):
    stop_event.set()
    collection_thread.join()
    logging_thread.join()
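
The timeout option mentioned in this answer could look roughly like the sketch below, which also keeps the asker's goal of logging the backlog before exiting. It is only an illustration: the names log_data and logged, the 0.1 second timeout, and the pre-filled queue are assumptions, and it presumes the collection thread has already stopped producing (for example because it was joined first).

```python
import threading
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2.7


def log_data(input_queue, stop_event, logged):
    # Keep going until a stop was requested AND the queue has been emptied.
    while not (stop_event.is_set() and input_queue.empty()):
        try:
            # Block for at most 0.1 s so stop_event is re-checked regularly.
            d = input_queue.get(timeout=0.1)
        except queue.Empty:
            continue
        logged.append(d)          # stand-in for the real database write
        input_queue.task_done()


input_queue = queue.Queue()
stop_event = threading.Event()
logged = []

# Simulate a backlog left behind by the collection thread.
for i in range(5):
    input_queue.put("DATA: %d" % i)

logging_thread = threading.Thread(target=log_data,
                                  args=(input_queue, stop_event, logged))
logging_thread.start()
stop_event.set()         # simulate Ctrl-C
logging_thread.join()    # returns once the backlog has been logged
```

The short timeout trades a little latency on shutdown for not having to send a special message through the queue.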

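Answer 3 (score: 0)

Building on tdelaney's answer, I created an iterator-based approach. The iterator exits when it encounters the termination message. I also added a counter holding the number of get calls that are currently blocking, and a stop method that sends just as many termination messages. To prevent a race condition between incrementing and reading the counter, I set a stop bit there. Furthermore, I don't use None as the termination message, because it can't necessarily be compared with other data types when using a PriorityQueue.

There are two limitations that I had no need to eliminate. For one, the stop method first waits for the queue to be empty before shutting down the threads. The second limitation is that I didn't write any code to make the queue reusable after stop. The latter could probably be added fairly easily, whereas the former requires care about concurrency and the context in which the code is used.

You have to decide whether you want stop to also wait for all termination messages to be consumed. I chose to put the necessary join there, but you can simply remove it.

So here is the code: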

import threading, queue

from functools import total_ordering
@total_ordering
class Final:
    def __repr__(self):
        return "∞"

    def __lt__(self, other):
        return False

    def __eq__(self, other):
        return isinstance(other, Final)

Infty = Final()

class IterQueue(queue.Queue):
    def __init__(self):
        self.lock = threading.Lock()
        self.stopped = False
        self.getters = 0
        super().__init__()

    def __iter__(self):
        return self

    def get(self):
        raise NotImplementedError("This queue may only be used as an iterator.")

    def __next__(self):
        with self.lock:
            if self.stopped:
                raise StopIteration
            self.getters += 1
        data = super().get()
        if data == Infty:
            self.task_done()
            raise StopIteration
        with self.lock:
            self.getters -= 1
        return data

    def stop(self):
        self.join()
        self.stopped = True
        with self.lock:
            for i in range(self.getters):
                self.put(Infty)
        self.join()

class IterPriorityQueue(IterQueue, queue.PriorityQueue):
    pass

Oh, I wrote this in Python 3.2. So after backporting,

import threading, Queue

from functools import total_ordering
@total_ordering
class Final:
    def __repr__(self):
        return "Infinity"

    def __lt__(self, other):
        return False

    def __eq__(self, other):
        return isinstance(other, Final)

Infty = Final()

class IterQueue(Queue.Queue, object):
    def __init__(self):
        self.lock = threading.Lock()
        self.stopped = False
        self.getters = 0
        super(IterQueue, self).__init__()

    def __iter__(self):
        return self

    def get(self):
        raise NotImplementedError("This queue may only be used as an iterator.")

    def next(self):
        with self.lock:
            if self.stopped:
                raise StopIteration
            self.getters += 1
        data = super(IterQueue, self).get()
        if data == Infty:
            self.task_done()
            raise StopIteration
        with self.lock:
            self.getters -= 1
        return data

    def stop(self):
        self.join()
        self.stopped = True
        with self.lock:
            for i in range(self.getters):
                self.put(Infty)
        self.join()

class IterPriorityQueue(IterQueue, Queue.PriorityQueue):
    pass

You would use it as

import random
import time

def sim_collectData(input_queue, stop_event):
    ''' this provides some output simulating the serial
    data from the data logging hardware. 
    '''
    n = 0
    while not stop_event.is_set():
        input_queue.put("DATA: <here are some random data> " + str(n))
        stop_event.wait(random.randint(0,5))
        n += 1
    print "Terminating data collection..."
    return

def logData(input_queue):
    n = 0

    # we *don't* want to loop based on queue size because the queue could
    # theoretically be empty while waiting on some data.
    for d in input_queue:
        if d.startswith("DATA:"):
            print d
        input_queue.task_done()
        n += 1

def main():
    input_queue = IterQueue()

    stop_event = threading.Event() # used to signal termination to the threads

    print "Starting data collection thread...",
    collection_thread = threading.Thread(target=sim_collectData, args=(input_queue, stop_event))
    collection_thread.start()
    print "Done."

    print "Starting logging thread...",
    logging_thread = threading.Thread(target=logData, args=(input_queue,))
    logging_thread.start()
    print "Done."

    try:
        while True:
            time.sleep(10)
    except (KeyboardInterrupt, SystemExit):
        # stop data collection. Let the logging thread finish logging everything in the queue
        stop_event.set()
        input_queue.stop()

main()
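
The remark in this answer about None and PriorityQueue can be checked with a small sketch. It assumes Python 3, where ordering comparisons between None and other types raise TypeError; the variable names none_is_orderable and drained are made up for this example, and the Final class is repeated from the answer (without __repr__) so the snippet stands alone.

```python
import queue  # Python 3 module name
from functools import total_ordering


@total_ordering
class Final:
    """Termination marker that compares greater than any other item."""

    def __lt__(self, other):
        return False

    def __eq__(self, other):
        return isinstance(other, Final)


Infty = Final()

# Part 1: None cannot be ordered against an int inside the heap.
pq = queue.PriorityQueue()
none_is_orderable = True
try:
    pq.put(1)
    pq.put(None)   # heap insertion compares None with 1
except TypeError:
    none_is_orderable = False

# Part 2: the marker object sorts after ordinary items.
pq = queue.PriorityQueue()
for item in (3, Infty, 1):
    pq.put(item)
drained = [pq.get() for _ in range(3)]
```

Since the marker never compares as smaller than anything, a PriorityQueue hands it out only after all ordinary items, which is what lets consumers drain real work before they see the termination message.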