Python multiprocessing Pool timeout

Date: 2016-08-02 04:36:18

Tags: python multithreading multiprocessing python-multithreading python-multiprocessing

I want to use multiprocessing.Pool, but multiprocessing.Pool cannot abort a task after a timeout. I found a solution and modified it somewhat.

from multiprocessing import util, Pool, TimeoutError
from multiprocessing.dummy import Pool as ThreadPool
import threading
import sys
from functools import partial
import time


def worker(y):
    print("worker sleep {} sec, thread: {}".format(y, threading.current_thread()))
    start = time.time()
    while True:
        if time.time() - start >= y:
            break
        time.sleep(0.5)
        # show work progress
        print(y)
    return y


def collect_my_result(result):
    print("Got result {}".format(result))


def abortable_worker(func, *args, **kwargs):
    timeout = kwargs.get('timeout', None)
    p = ThreadPool(1)
    res = p.apply_async(func, args=args)
    try:
        # Wait timeout seconds for func to complete.
        out = res.get(timeout)
    except TimeoutError:
        # args holds only the sleep time, so index 0 (args[1] would raise IndexError)
        print("Aborting due to timeout {}".format(args[0]))
        # exit this worker process when the job times out
        sys.exit(1)
    else:
        return out


def empty_func():
    pass


if __name__ == "__main__":
    TIMEOUT = 4
    util.log_to_stderr(util.DEBUG)
    pool = Pool(processes=4)

    # k: number of seconds each job sleeps
    featureClass = [(k,) for k in range(20, 0, -1)]  # list of arguments
    for f in featureClass:
        # block until a worker is available
        pool.apply(empty_func)

        # run job with timeout
        abortable_func = partial(abortable_worker, worker, timeout=TIMEOUT)
        pool.apply_async(abortable_func, args=f, callback=collect_my_result)

    time.sleep(TIMEOUT)
    pool.terminate()
    print("exit")

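The main modification is the use of sys.exit(1) to exit the worker process. This kills the worker process and, with it, the job thread, but I am not sure this is a good solution. What potential problems can I run into when a process terminates itself while a job is still running?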

1 Answer:

Answer 0 (score: 5)

There is no implicit risk in stopping a running job: the OS will take care of terminating the process correctly.

If your job writes to files, you might end up with lots of truncated files on your disk.
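One common way to limit the truncated-file problem is to write to a temporary file and rename it into place only once the write has finished. The sketch below assumes Python 3 (for os.replace); write_atomically is a hypothetical helper name, not part of multiprocessing or Pebble. If the process is killed mid-write, a stray temporary file may be left behind, but the target path never holds a half-written result.

```python
import os
import tempfile


def write_atomically(path, data):
    """Write text to a temp file in the target directory, then rename it over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # The target is swapped in only after the data is fully written.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temp file on ordinary failures, then re-raise.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A job would call write_atomically("result.txt", text) instead of opening the output file directly.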

Some small issues might also occur if you write to a DB or are connected to some remote process.

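Nevertheless, the Python standard Pool does not support terminating a worker when a task times out, and terminating processes abruptly might lead to weird behaviour within your application.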
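To illustrate the limitation of the standard pool: the timeout passed to AsyncResult.get only bounds how long the caller waits, and the task itself keeps running. The sketch below uses multiprocessing.pool.ThreadPool so that it runs without child processes (an assumption made for brevity; Pool.apply_async returns the same kind of result object). slow_task and demo are illustrative names only.

```python
import threading
import time
from multiprocessing import TimeoutError
from multiprocessing.pool import ThreadPool


def slow_task(done, seconds):
    """Sleep, then record that the task ran to completion."""
    time.sleep(seconds)
    done.set()
    return seconds


def demo():
    done = threading.Event()
    pool = ThreadPool(1)
    try:
        res = pool.apply_async(slow_task, (done, 0.3))
        try:
            res.get(timeout=0.05)
            timed_out = False
        except TimeoutError:
            # Only the wait was abandoned; nothing cancelled the task.
            timed_out = True
        # Give the task time to finish and check whether it did.
        finished_anyway = done.wait(timeout=5)
    finally:
        pool.close()
        pool.join()
    return timed_out, finished_anyway
```

Calling demo() reports whether the wait timed out and whether the task nevertheless completed afterwards, which is why the question resorts to killing the worker process.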

The Pebble process pool does support tasks with a timeout.

from pebble import process, TimeoutError

# `function` stands for your own callable
with process.Pool() as pool:
    task = pool.schedule(function, args=[1,2], timeout=5)

    try:
        result = task.get()
    except TimeoutError:
        print("Task: %s took more than 5 seconds to complete" % task)