What is the best way to run this process in parallel?

Time: 2019-02-15 22:11:50

Tags: python python-3.x multithreading parallel-processing multiprocessing

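I have been trying to parallelize a process inside a class method. When I use Pool() from multiprocessing, I get a pickling error. When I use Pool() from multiprocessing.dummy, execution is slower than the serial version.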
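A note on the pickling error: multiprocessing sends the callable and its arguments to the worker processes with pickle, and pickling a bound method such as self.process_function also pickles the whole instance. The sketch below is only an illustration of that mechanism, assuming the instance holds an attribute that cannot be pickled (a lock here; in the question it might be something reachable through my_interface, which is a guess). The names Holder, PicklableHolder, and can_pickle are hypothetical and not part of the original code.

```python
import pickle
import threading


class Holder:
    """Hypothetical stand-in for a class that keeps an unpicklable attribute."""

    def __init__(self, scale):
        self.scale = scale
        self.lock = threading.Lock()  # lock objects cannot be pickled

    def work(self, x):
        return self.scale * x


class PicklableHolder(Holder):
    """Same class, but the unpicklable attribute is dropped when pickling."""

    def __getstate__(self):
        state = self.__dict__.copy()
        state.pop("lock", None)  # leave the lock out of the pickled state
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.lock = threading.Lock()  # recreate it after unpickling


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
```

Under these assumptions, can_pickle(Holder(2).work) is False while can_pickle(PicklableHolder(2).work) is True, so only the second bound method could be handed to a process pool. Whether this matches the actual cause of the error depends on what the real class stores.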

Using Stack Overflow posts as a guide, I have tried several variations of the code below, but none of them solved the problems described above.

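One example: if I move process_function above the class definition (making it a global function), it does not work, because I can no longer access the object's attributes.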
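One possible way around the lost attribute access, sketched here under assumptions: keep the worker at module level and pass the attributes it needs as explicit arguments, for example with functools.partial. The names process_item, run_parallel, scale, and n_iterations are hypothetical placeholders for the real attributes and computation. The worker returns a new dict instead of modifying its input, because changes made inside a worker process do not reach the objects in the parent process.

```python
from functools import partial
from multiprocessing import Pool


def process_item(item, scale, n_iterations):
    """Module-level worker: everything it needs arrives as an argument."""
    total = 0.0
    for i in range(n_iterations):
        total += scale * item["key"] * i  # placeholder for the real math
    # Return a new dict; in-place edits made in a worker process
    # would not be visible in the parent process.
    return {**item, "other_key": total}


def run_parallel(items, scale, n_iterations, processes=2):
    """Map process_item over items using a pool of worker processes."""
    # partial() binds the former instance attributes as keyword arguments
    worker = partial(process_item, scale=scale, n_iterations=n_iterations)
    with Pool(processes=processes) as pool:
        return pool.map(worker, items)
```

A call such as run_parallel(my_obj.relevant_list, scale=2.0, n_iterations=3) would go under the existing if __name__ == '__main__': guard, which multiprocessing needs on platforms that start workers by spawning a fresh interpreter.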

Anyway, my code looks something like this:

from multiprocessing.dummy import Pool as ThreadPool

import numpy as np  # needed for np.random.seed below

from my_other_module import other_module_class

class myClass:

    def __init__(self, some_list, number_iterations):

        self.my_interface = other_module_class
        self.relevant_list = []
        self.some_list = some_list
        self.number_iterations = number_iterations
        # self.other_attributes = stuff from import statements

    def load_relevant_data(self):

        self.relevant_list = self.my_interface.other_function()

    def compute_foo(self, relevant_list_member_value):

        # math involving class attributes

        return foo_scalar

    def higher_function(self):

        self.load_relevant_data()  # populates self.relevant_list
        np.random.seed(0)
        # I've tried different args here, no help
        with ThreadPool() as pool:
            pool.map(self.process_function, self.relevant_list)

    def process_function(self, dict_from_relevant_list):

        foo_bar = self.compute_foo(dict_from_relevant_list['key'])
        a = 0
        for i in some_other_list:

            # do other stuff involving class attributes and foo_bar
            # a = some of that
            pass  # placeholder so the elided loop body is valid syntax

        dict_from_relevant_list['other_key'] = a


if __name__ == '__main__':

    import time
    import pprint as pp

    some_list = blah
    number_of_iterations = 10**4
    my_obj = myClass(some_list, number_of_iterations)
    my_obj.load_third_parties()
    start = time.time()
    my_obj.higher_function()
    execution_time = time.time() - start
    print()
    print("Execution time for %s simulation runs: %s" % (number_of_iterations, execution_time))
    print()
    pp.pprint(my_obj.relevant_list[0:5])

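I have a few hundred dictionaries in the relevant list. All I want is to fill in each dictionary's 'other_key' field from the computationally heavy simulation in my innermost loop, which produces a scalar value such as a above. It seems there should be a simple way to do this, because in Matlab I could just write parfor and it would happen automatically. Maybe that instinct is wrong for Python.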
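For comparison, the closest standard-library analogue to a parfor over independent items is probably an executor's map(). The sketch below is an assumption-laden illustration, not a drop-in fix: simulate is a hypothetical stand-in for the expensive simulation, fill_other_key and executor_cls are invented names, and each task gets its own seeded generator instead of relying on one global seed. Processes are the default because, in CPython, threads running pure-Python CPU-bound code take turns under the global interpreter lock, which is consistent with the thread pool being slower than the serial version.

```python
import random
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def simulate(value, seed, n_iterations):
    """Hypothetical stand-in for the expensive simulation; returns a scalar."""
    rng = random.Random(seed)  # private generator, so each task is reproducible
    total = 0.0
    for _ in range(n_iterations):
        total += value * rng.random()
    return total / n_iterations


def fill_other_key(items, n_iterations, executor_cls=ProcessPoolExecutor):
    """Compute one scalar per dict in parallel and store it under 'other_key'.

    Passing executor_cls=ThreadPoolExecutor runs the same code with threads,
    which can be convenient for debugging.
    """
    values = [item["key"] for item in items]
    seeds = range(len(items))  # one distinct seed per task
    repeats = [n_iterations] * len(items)
    with executor_cls() as executor:
        # map() yields results in input order, like a parfor over indices
        results = executor.map(simulate, values, seeds, repeats)
        for item, result in zip(items, results):
            item["other_key"] = result  # written back in the parent process
    return items
```

Only the scalar inputs and outputs cross the process boundary here, so the dictionaries themselves are updated in the parent and nothing from the class instance has to be pickled.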

0 answers:

No answers yet