Child processes hang with multiprocessing

Time: 2015-08-13 11:22:17

Tags: python python-2.7 parallel-processing multiprocessing python-multiprocessing

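The problem I am running into is that child processes hang in my Python application: only 4 of 16 processes have finished. All of these processes add items to a multiprocessing queue. According to the Python docs (https://docs.python.org/3/library/multiprocessing.html#pipes-and-queues):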

  

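Warning

As mentioned above, if a child process has put items on a queue (and it has not used JoinableQueue.cancel_join_thread), then that process will not terminate until all buffered items have been flushed to the pipe.

This means that if you try joining that process you may get a deadlock unless you are sure that all items which have been put on the queue have been consumed. Similarly, if the child process is non-daemonic then the parent process may hang on exit when it tries to join all its non-daemonic children.

Note that a queue created using a manager does not have this issue. See Programming guidelines.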

I believe this may be my problem, but I do call get() on the queue before joining. I am not sure what other options I have.

def RunInThread(dictionary):
    startedProcesses = list()
    resultList = list()
    output = Queue()
    scriptList = ThreadChunk(dictionary, 16) # last number determines how many threads

    for item in scriptList:
        if __name__ == '__main__':
            proc = Process(target=CreateScript, args=(item, output))
            startedProcesses.append(proc)
            proc.start()

    while not output.empty():
        resultList.append(output.get())

    #we must wait for the processes to finish before continuing
    for process in startedProcesses:
        process.join()
        print "finished"

#defines chunk of data each thread will process
def ThreadChunk(seq, num):
  avg = len(seq) / float(num)
  out = []
  last = 0.0

  while last < len(seq):
    out.append(seq[int(last):int(last + avg)])
    last += avg

  return out

def CreateScript(scriptsToGenerate, queue):
    start = time.clock()
    for script in scriptsToGenerate:
        ...
        queue.put([script['timeInterval'], script['script']])

    print time.clock() - start
    print "I have finished"

1 answer:

Answer 0 (score: 1)

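The problem with your code is that while not output.empty() is unreliable (see the documentation for empty). You may also hit the case where the interpreter reaches while not output.empty() before the processes you created have finished initializing, so the Queue really is empty at that moment and the loop exits without reading anything.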
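A minimal sketch of that second failure mode, using a thread in place of a child process so that the timing can be controlled (an assumption made purely for illustration; producer and start_signal are hypothetical names). The consumer checks empty() before the producer has had a chance to put anything, so a loop of the form while not output.empty() would exit immediately:

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2.7

def producer(out_queue, start_signal):
    # Hypothetical stand-in for a child process that is slow to start.
    start_signal.wait()
    out_queue.put("result")

out_queue = queue.Queue()
start_signal = threading.Event()
t = threading.Thread(target=producer, args=(out_queue, start_signal))
t.start()

# The producer has not put anything yet, so the queue looks empty
# even though a result is on its way.
seen_empty_early = out_queue.empty()

start_signal.set()                   # now let the producer run
item = out_queue.get(timeout=5)      # a blocking get() still receives the item
t.join()
```

With real processes the delay comes from process startup and from the queue's feeder thread rather than from an Event, but the effect on the reading loop is the same.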

Since you know exactly how many items will be put on the queue (namely len(dictionary)), you can read that many items from the queue:

def RunInThread(dictionary):
    startedProcesses = list()
    output = Queue()
    scriptList = ThreadChunk(dictionary, 16) # last number determines how many threads

    for item in scriptList:
        proc = Process(target=CreateScript, args=(item, output))
        startedProcesses.append(proc)
        proc.start()

    resultList = [output.get() for _ in xrange(len(dictionary))]

    #we must wait for the processes to finish before continuing
    for process in startedProcesses:
        process.join()

    print "finished"

If at some point you modify your script and no longer know how many items will be produced, you can use Queue.get with a reasonable timeout:

from Queue import Empty # Python 2.7 module name; on Python 3 it is "queue"

def RunInThread(dictionary):
    startedProcesses = list()
    resultList = list()
    output = Queue()
    scriptList = ThreadChunk(dictionary, 16) # last number determines how many threads

    for item in scriptList:
        proc = Process(target=CreateScript, args=(item, output))
        startedProcesses.append(proc)
        proc.start()

    try:
        while True:
            resultList.append(output.get(True, 2)) # block for up to 2 seconds per item, just in case
    except Empty:
        pass # nothing arrived within the timeout: assume no more items

    #we must wait for the processes to finish before continuing
    for process in startedProcesses:
        process.join()

    print "finished"

You may need to adjust the timeout according to the actual computation time in CreateScript.
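If tuning a timeout feels fragile, one common alternative (not part of the answer above, and sketched here under assumptions) is to have each worker put a sentinel on the queue when it finishes, so the parent can stop reading once it has seen one sentinel per worker. The names worker, collect, run and SENTINEL are hypothetical, and the doubling is only a placeholder for the real work done in CreateScript:

```python
import multiprocessing as mp

# Hypothetical end-of-work marker; this assumes None is never a real result.
SENTINEL = None

def worker(chunk, out_queue):
    for item in chunk:
        out_queue.put(item * 2)   # placeholder for the real computation
    out_queue.put(SENTINEL)       # tell the parent this worker is done

def collect(out_queue, n_workers):
    # Read until every worker has reported completion.
    results = []
    finished = 0
    while finished < n_workers:
        item = out_queue.get()
        if item is SENTINEL:
            finished += 1
        else:
            results.append(item)
    return results

def run(chunks):
    out_queue = mp.Queue()
    procs = [mp.Process(target=worker, args=(chunk, out_queue))
             for chunk in chunks]
    for proc in procs:
        proc.start()
    # Drain the queue before joining, as the documentation warning requires.
    results = collect(out_queue, len(procs))
    for proc in procs:
        proc.join()
    return results
```

Called from under an if __name__ == '__main__': guard, run([[1, 2], [3], []]) would return the doubled values in arbitrary order. Unlike the timeout version, this does not depend on how long each worker takes, but it does hang if a worker dies before putting its sentinel.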
