Running simultaneous and separate SEAWAT/MODFLOW model runs with Python's multiprocessing module

时间:2012-03-26 14:29:07

标签: python multiprocessing

I am trying to complete 100 model runs on my 8-processor, 64-bit Windows 7 machine. I would like to run 7 instances of the model concurrently to reduce my total run time (each model run takes approximately 9.5 minutes). I have looked at several threads pertaining to Python's multiprocessing module, but I am still missing something:

Using the multiprocessing module

How to spawn parallel child processes on a multi-processor system?

Python Multiprocessing queue

My process:

I have 100 different parameter sets that I would like to run through SEAWAT/MODFLOW to compare the results. I have pre-built the model input files for each model run and stored them in their own directories. What I would like to be able to do is run 7 models at a time until all realizations have been completed. There needn't be any communication between processes or display of results. So far I have only been able to spawn the models sequentially:

import os,subprocess
import multiprocessing as mp

ws = r'D:\Data\Users\jbellino\Project\stJohnsDeepening\model\xsec_a'
files = []
for f in os.listdir(ws + r'\fieldgen\reals'):
    if f.endswith('.npy'):
        files.append(f)

## def work(cmd):
##     return subprocess.call(cmd, shell=False)

def run(f,def_param=ws):
    real = f.split('_')[2].split('.')[0]
    print 'Realization %s' % real

    mf2k = r'c:\modflow\mf2k.1_19\bin\mf2k.exe '
    mf2k5 = r'c:\modflow\MF2005_1_8\bin\mf2005.exe '
    seawatV4 = r'c:\modflow\swt_v4_00_04\exe\swt_v4.exe '
    seawatV4x64 = r'c:\modflow\swt_v4_00_04\exe\swt_v4x64.exe '

    exe = seawatV4x64
    swt_nam = ws + r'\reals\real%s\ss\ss.nam_swt' % real

    os.system( exe + swt_nam )


if __name__ == '__main__':
    p = mp.Pool(processes=mp.cpu_count()-1) #-leave 1 processor available for system and other processes
    tasks = range(len(files))
    results = []
    for f in files:
        r = p.map_async(run(f), tasks, callback=results.append)

I changed the if __name__ == 'main': block to the following, hoping it would fix the lack of parallelism I felt was being imparted on the script above by the for loop. However, the model fails to even run (no Python error):

if __name__ == '__main__':
    p = mp.Pool(processes=mp.cpu_count()-1) #-leave 1 processor available for system and other processes
    p.map_async(run,((files[f],) for f in range(len(files))))
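
For reference, a minimal sketch (reusing the run() and files defined above) of what a blocking version of this call would look like; map_async() returns immediately, so the script can exit before the pool has done any work, and worker exceptions stay hidden unless get() is called:

if __name__ == '__main__':
    p = mp.Pool(processes=mp.cpu_count()-1) #-leave 1 processor available
    result = p.map_async(run, files) # pass the function itself, not run(f)
    p.close()                        # no more tasks will be submitted
    result.get()                     # block until done; re-raises worker errors
    p.join()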

Any and all help is greatly appreciated!

EDIT 3/26/2012 13:31 EST

Using the "manual pool" method from @J.F. Sebastian's answer below, I got parallel execution of my external .exe. Model realizations are called up in batches of 8 at a time, but it doesn't wait for those 8 runs to complete before calling up the next batch, and so on:

from __future__ import print_function
import os,subprocess,sys
import multiprocessing as mp
from Queue import Queue
from threading import Thread

def run(f,ws):
    real = f.split('_')[-1].split('.')[0]
    print('Realization %s' % real)
    seawatV4x64 = r'c:\modflow\swt_v4_00_04\exe\swt_v4x64.exe '
    swt_nam = ws + r'\reals\real%s\ss\ss.nam_swt' % real
    subprocess.check_call([seawatV4x64, swt_nam])

def worker(queue):
    """Process files from the queue."""
    for args in iter(queue.get, None):
        try:
            run(*args)
        except Exception as e: # catch exceptions to avoid exiting the
                               # thread prematurely
            print('%r failed: %s' % (args, e,), file=sys.stderr)

def main():
    # populate files
    ws = r'D:\Data\Users\jbellino\Project\stJohnsDeepening\model\xsec_a'
    wdir = os.path.join(ws, r'fieldgen\reals')
    q = Queue()
    for f in os.listdir(wdir):
        if f.endswith('.npy'):
            q.put_nowait((os.path.join(wdir, f), ws))

    # start threads
    threads = [Thread(target=worker, args=(q,)) for _ in range(8)]
    for t in threads:
        t.daemon = True # threads die if the program dies
        t.start()

    for _ in threads: q.put_nowait(None) # signal no more files
    for t in threads: t.join() # wait for completion

if __name__ == '__main__':

    mp.freeze_support() # optional if the program is not frozen
    main()

No error traceback is available. The run() function performs its duty when called on a single model realization file, just as it does with multiple files. The only difference is that with multiple files it is called len(files) times, yet each of the instances immediately closes and only one model run is allowed to finish, at which point the script exits gracefully (exit code 0).
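
One way to see why the instances close immediately (a sketch, not part of the original post) is a run() variant that captures the child's console output, so any error text from SEAWAT survives the window closing:

def run_debug(f, ws):
    # like run(), but captures the child's stdout/stderr so SEAWAT's
    # error message is visible even after its console window closes
    real = f.split('_')[-1].split('.')[0]
    seawatV4x64 = r'c:\modflow\swt_v4_00_04\exe\swt_v4x64.exe'
    swt_nam = ws + r'\reals\real%s\ss\ss.nam_swt' % real
    p = subprocess.Popen([seawatV4x64, swt_nam],
                         stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    out, _ = p.communicate()
    if p.returncode != 0:
        print('Realization %s failed with code %s:\n%s'
              % (real, p.returncode, out), file=sys.stderr)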

Adding some print statements to main() reveals some information about active thread counts as well as thread status (note that this is a test on only 8 of the realization files to make the screenshot more manageable; theoretically all 8 files should run concurrently, however the behavior continues where they are spawned and immediately die, except for one):

def main():
    # populate files
    ws = r'D:\Data\Users\jbellino\Project\stJohnsDeepening\model\xsec_a'
    wdir = os.path.join(ws, r'fieldgen\test')
    q = Queue()
    for f in os.listdir(wdir):
        if f.endswith('.npy'):
            q.put_nowait((os.path.join(wdir, f), ws))

    # start threads
    threads = [Thread(target=worker, args=(q,)) for _ in range(mp.cpu_count())]
    for t in threads:
        t.daemon = True # threads die if the program dies
        t.start()
    print('Active Count a',threading.activeCount())
    for _ in threads:
        print(_)
        q.put_nowait(None) # signal no more files
    for t in threads: 
        print(t)
        t.join() # wait for completion
    print('Active Count b',threading.activeCount())

screenshot

**The line that reads "D:\\Data\\Users..." is the error information thrown when I manually stop the model from running to completion. Once I stop the model run, the remaining thread-status lines are reported and the script exits.

EDIT 3/26/2012 16:24 EST

SEAWAT does allow concurrent execution, as I have done this in the past, spawning instances manually using iPython and launching from each model file folder. This time around, I am launching all model runs from a single location, namely the directory where my script resides. It looks like the culprit may be the way SEAWAT saves some of its output. When SEAWAT is run, it immediately creates files pertaining to the model run. One of these files is not being saved to the directory in which the model realization is located, but in the top directory where the script is located. This prevents any subsequent thread from saving the same file name in the same location (which they all want to do, since these file names are generic and non-specific to each realization). The SEAWAT windows were not staying open long enough for me to read or even see that there was an error message; I only realized this when I went back and tried to run the code using iPython, which displays the printout from SEAWAT directly instead of opening a new window to run the program.

I am accepting @J.F. Sebastian's answer, since once I resolve this model-executable issue, the threading code he provided should get me where I need to be.

Final code

Added the cwd argument to subprocess.check_call to start each instance of SEAWAT in its own directory. Very key.

from __future__ import print_function
import os,subprocess,sys
import multiprocessing as mp
from Queue import Queue
from threading import Thread
import threading

def run(f,ws):
    real = f.split('_')[-1].split('.')[0]
    print('Realization %s' % real)
    seawatV4x64 = r'c:\modflow\swt_v4_00_04\exe\swt_v4x64.exe '
    cwd = ws + r'\reals\real%s\ss' % real
    swt_nam = ws + r'\reals\real%s\ss\ss.nam_swt' % real
    subprocess.check_call([seawatV4x64, swt_nam],cwd=cwd)

def worker(queue):
    """Process files from the queue."""
    for args in iter(queue.get, None):
        try:
            run(*args)
        except Exception as e: # catch exceptions to avoid exiting the
                               # thread prematurely
            print('%r failed: %s' % (args, e,), file=sys.stderr)

def main():
    # populate files
    ws = r'D:\Data\Users\jbellino\Project\stJohnsDeepening\model\xsec_a'
    wdir = os.path.join(ws, r'fieldgen\reals')
    q = Queue()
    for f in os.listdir(wdir):
        if f.endswith('.npy'):
            q.put_nowait((os.path.join(wdir, f), ws))

    # start threads
    threads = [Thread(target=worker, args=(q,)) for _ in range(mp.cpu_count()-1)]
    for t in threads:
        t.daemon = True # threads die if the program dies
        t.start()
    for _ in threads: q.put_nowait(None) # signal no more files
    for t in threads: t.join() # wait for completion

if __name__ == '__main__':
    mp.freeze_support() # optional if the program is not frozen
    main()

2 Answers:

Answer 0 (score: 15)

I don't see any computations in the Python code. If you just need to execute several external programs in parallel, it is sufficient to use subprocess to run the programs and the threading module to maintain a constant number of running processes, but the simplest code uses multiprocessing.Pool:

#!/usr/bin/env python
import os
import multiprocessing as mp

def run(filename_def_param):
    filename, def_param = filename_def_param # unpack arguments
    ... # call external program on `filename`

def safe_run(*args, **kwargs):
    """Call run(), catch exceptions."""
    try: run(*args, **kwargs)
    except Exception as e:
        print("error: %s run(*%r, **%r)" % (e, args, kwargs))

def main():
    # populate files
    ws = r'D:\Data\Users\jbellino\Project\stJohnsDeepening\model\xsec_a'
    workdir = os.path.join(ws, r'fieldgen\reals')
    files = ((os.path.join(workdir, f), ws)
             for f in os.listdir(workdir) if f.endswith('.npy'))

    # start processes
    pool = mp.Pool() # use all available CPUs
    pool.map(safe_run, files)

if __name__=="__main__":
    mp.freeze_support() # optional if the program is not frozen
    main()

If there are many files then pool.map() could be replaced by for _ in pool.imap_unordered(safe_run, files): pass.

There is also multiprocessing.dummy.Pool, which provides the same interface as multiprocessing.Pool but uses threads instead of processes; that might be more appropriate in this case.
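
By way of illustration (a sketch added here, not part of the original answer), swapping in the thread-based pool is a near one-line change against the Pool example above:

import multiprocessing.dummy

# Same interface as multiprocessing.Pool, but backed by threads; adequate
# here because each worker spends its time blocked on an external process.
pool = multiprocessing.dummy.Pool(7)   # e.g. 7 concurrent model runs
pool.map(safe_run, files)              # `safe_run` and `files` as defined above
pool.close()
pool.join()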

You don't need to keep some CPUs free. Just use a command that starts your executables with a low priority (on Linux it is the nice program).
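
For the Windows machine in the question, a comparable effect can be achieved with Popen's creationflags argument (a sketch; the priority constant comes from the Win32 API and is spelled out because Python 2's subprocess module does not define it):

import subprocess

BELOW_NORMAL_PRIORITY_CLASS = 0x00004000 # Win32 process-priority flag

def run_low_priority(exe, nam_file, cwd):
    # launch the model at below-normal priority so the OS stays responsive
    subprocess.check_call([exe, nam_file], cwd=cwd,
                          creationflags=BELOW_NORMAL_PRIORITY_CLASS)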

ThreadPoolExecutor example

concurrent.futures.ThreadPoolExecutor would be both simple and sufficient, but it requires a 3rd-party dependency on Python 2.x (it has been in the stdlib since Python 3.2).

#!/usr/bin/env python
import os
import concurrent.futures

def run(filename, def_param):
    ... # call external program on `filename`

# populate files
ws = r'D:\Data\Users\jbellino\Project\stJohnsDeepening\model\xsec_a'
wdir = os.path.join(ws, r'fieldgen\reals')
files = (os.path.join(wdir, f) for f in os.listdir(wdir) if f.endswith('.npy'))

# start threads
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as executor:
    future_to_file = dict((executor.submit(run, f, ws), f) for f in files)

    for future in concurrent.futures.as_completed(future_to_file):
        f = future_to_file[future]
        if future.exception() is not None:
            print('%r generated an exception: %s' % (f, future.exception()))
        # run() doesn't return anything so `future.result()` is always `None`

Or if we ignore the exceptions raised by run():

from itertools import repeat

... # the same

# start threads
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as executor:
    executor.map(run, files, repeat(ws))
    # run() doesn't return anything so `map()` results can be ignored

threading + subprocess (manual pool) solution

#!/usr/bin/env python
from __future__ import print_function
import os, subprocess, sys
import multiprocessing as mp
from Queue import Queue
from threading import Thread

def run(filename, def_param):
    ... # call external program on `filename`

def worker(queue):
    """Process files from the queue."""
    for args in iter(queue.get, None):
        try:
            run(*args)
        except Exception as e: # catch exceptions to avoid exiting the
                               # thread prematurely
            print('%r failed: %s' % (args, e,), file=sys.stderr)

def main():
    # populate files
    ws = r'D:\Data\Users\jbellino\Project\stJohnsDeepening\model\xsec_a'
    wdir = os.path.join(ws, r'fieldgen\reals')
    q = Queue()
    for f in os.listdir(wdir):
        if f.endswith('.npy'):
            q.put_nowait((os.path.join(wdir, f), ws))

    # start threads
    threads = [Thread(target=worker, args=(q,)) for _ in range(8)]
    for t in threads:
        t.daemon = True # threads die if the program dies
        t.start()

    for _ in threads: q.put_nowait(None) # signal no more files
    for t in threads: t.join() # wait for completion

if __name__ == '__main__':
    mp.freeze_support() # optional if the program is not frozen
    main()

Answer 1 (score: 1)

Here is my way of maintaining a minimum number x of threads in memory. It is a combination of the threading and multiprocessing modules. It may be unusual compared to the techniques that other respected members have explained, but it may be worthwhile. For the sake of explanation, I take a scenario of crawling a minimum of 5 websites at a time.

So here it is:

#importing dependencies.
from multiprocessing import Process
from threading import Thread
import threading

# Crawler function
def crawler(domain):
    # define the crawling technique here; `scrapeddata` would be its result.
    output.write(scrapeddata + "\n") # `output` is the global file opened in __main__
    pass

Next comes the threadController function. This function controls the flow of threads into main memory. It keeps activating threads to maintain the threadNum "minimum" limit, i.e. 5. It also will not exit until all active threads (activeCount) have finished.

It maintains a minimum of threadNum (5) startProcess-function threads (these threads will eventually start processes from processList, joining them with a timeout of 60 seconds). After threadController is started, there are 2 threads which are not included in the limit of 5 above: the main thread and the threadController thread itself. That is why threading.activeCount() != 2 is used.

def threadController():
    print "Thread count before child thread starts is:-", threading.activeCount(), len(processList)
    # starting the first thread. This will make activeCount = 3.
    Thread(target = startProcess).start()
    # loop while the process list is not empty OR active threads have not finished.
    while len(processList) != 0 or threading.activeCount() != 2:
        if (threading.activeCount() < (threadNum + 2) and # if the count of active threads is less than the minimum AND
            len(processList) != 0):                       # the processList is not empty
                Thread(target = startProcess).start()     # start startProcess as a separate thread **

The startProcess function, run as a separate thread, starts processes from the process list. The purpose of this function (** started as a different thread) is that it becomes a parent thread for the processes. So when it joins them with a 60-second timeout, the startProcess thread is blocked from moving ahead, but this does not stop threadController from executing. This way, threadController works as required.

def startProcess():
    pr = processList.pop(0)
    pr.start()
    pr.join(60.00) # joining the process with a timeout of 60 seconds (as a float).

if __name__ == '__main__':
    # a file holding a list of domains
    domains = open("Domains.txt", "r").read().split("\n")
    output = open("test.txt", "a")
    processList = [] # list of crawler processes
    threadNum = 5 # number of thread-initiated processes to run at one time

    # making process List
    for r in range(0, len(domains), 1):
        domain = domains[r].strip()
        p = Process(target = crawler, args = (domain,))
        processList.append(p) # making a list of crawler processes.

    # starting the threadController as a separate thread.
    mt = Thread(target = threadController)
    mt.start()
    mt.join() # won't proceed until the threadController thread finishes.

    output.close()
    print "Done"

Besides keeping a minimum number of threads in memory, my aim was also to avoid threads or processes getting stuck in memory. I did this using the timeout feature. My apologies for any typing mistakes.

I hope this construction helps anyone in this world. Regards, Vikas Gautam
