Sharing read-only objects between processes

Date: 2014-11-16 21:21:01

Tags: python multiprocessing

I use a "keep alive" process model in my software (the worker processes communicate with the main process through Pipes), and I'm trying to share read-only objects between them and the main process. An example that shows my problem:

from multiprocessing import Process, Pipe  # , Manager
from multiprocessing.connection import wait
import os


def start_in_oneshot_processes(obj, nb_process):
    """
    Start nb_process processes to do the job. When a process finishes its job, it dies.
    """
    processes = []
    for i in range(nb_process):
        #  Simple process style
        p = Process(target=oneshot_in_process, args=(obj,))
        p.start()
        processes.append(p)

    for process in processes:
        # Wait for all processes to finish
        process.join()

def oneshot_in_process(obj):
    """
    Main job (it doesn't matter whether it runs in a one-shot or a kept-alive process; it has a job to do).
    """
    print('p', obj, os.getpid())


def start_in_keepedalive_processes(obj, nb_process):
    """
    Start nb_process processes and keep them alive. Send jobs to them several times, then shut them down.
    """
    processes = []
    readers_pipes = []
    writers_pipes = []
    for i in range(nb_process):
        # Start the process with Pipes for communication
        local_read_pipe, local_write_pipe = Pipe(duplex=False)
        process_read_pipe, process_write_pipe = Pipe(duplex=False)
        readers_pipes.append(local_read_pipe)
        writers_pipes.append(process_write_pipe)
        p = Process(target=run_keepedalive_process, args=(local_write_pipe, process_read_pipe, obj))
        p.start()
        processes.append(p)
    # Send the processes some jobs to do
    for job in range(3):
        print('send new job to processes:')
        reader_useds = []
        for process_number in range(nb_process):
            # Send data to the process
            writers_pipes[process_number].send(obj)
        # Wait for responses from the processes
        while readers_pipes:
            for r in wait(readers_pipes):
                try:
                    r.recv()
                except EOFError:
                    pass
                finally:
                    reader_useds.append(r)
                    readers_pipes.remove(r)
        readers_pipes = reader_useds

    # Kill processes
    for writer_pipe in writers_pipes:
        writer_pipe.send('stop')

def run_keepedalive_process(main_write_pipe, process_read_pipe, obj):
    """
    Process that keeps running as long as there are jobs to do.
    """
    while obj != 'stop':
        oneshot_in_process(obj)
        # Tell the main process "I've done my job"
        main_write_pipe.send('job is done')
        # Wait for a new job to do (can this part be simplified?)
        readers = [process_read_pipe]
        while readers:
            for r in wait(readers):
                try:
                    obj = r.recv()
                except EOFError:
                    pass
                finally:
                    readers.remove(r)


obj = object()
print('m', obj, os.getpid())

print('One shot processes:')
start_in_oneshot_processes(obj, 5)

print('Keeped alive processes:')
start_in_keepedalive_processes(obj, 5)

print('f', obj, os.getpid())

The output is:

➜  sandbox git:(dev/opt) ✗ python3.4 sharedd.py
m <object object at 0xb7266dc8> 3225
One shot processes:
p <object object at 0xb7266dc8> 3227
p <object object at 0xb7266dc8> 3226
p <object object at 0xb7266dc8> 3229
p <object object at 0xb7266dc8> 3228
p <object object at 0xb7266dc8> 3230
Keeped alive processes:
p <object object at 0xb7266dc8> 3231
p <object object at 0xb7266dc8> 3232
send new job to processes:
p <object object at 0xb7266dc8> 3235
p <object object at 0xb7266dc8> 3233
p <object object at 0xb7266dc8> 3234
p <object object at 0xb7266488> 3231
send new job to processes:
p <object object at 0xb7266488> 3232
p <object object at 0xb7266488> 3235
p <object object at 0xb7266488> 3234
p <object object at 0xb7266490> 3231
p <object object at 0xb7266488> 3233
p <object object at 0xb7266490> 3232
p <object object at 0xb7266490> 3235
p <object object at 0xb7266490> 3233
send new job to processes:
p <object object at 0xb7266488> 3232
p <object object at 0xb7266488> 3235
p <object object at 0xb7266490> 3234
p <object object at 0xb7266488> 3231
f <object object at 0xb7266dc8> 3225
p <object object at 0xb7266488> 3233
p <object object at 0xb7266488> 3234

If I create plain processes (start_in_oneshot_processes), obj has the same memory address in the child processes as in the main process: 0xb7266dc8.

But when my processes receive the object through a Pipe (start_in_keepedalive_processes), the object's memory address differs from the main process's: for example, 0xb7266488 instead of 0xb7266dc8. These objects are read-only for the child processes. How can I share them between the main process and the children (and save the time spent copying memory)?

1 Answer:

Answer 0 (score: 3)

You cannot achieve what you want with a Pipe. When data is sent through a pipe to another process, that process has to store a copy of the data in its own address space: a new Python object. Most other methods of transferring data between processes work the same way.
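This copy-on-receive behaviour is visible even without spawning a process: `multiprocessing` pickles objects before pushing them through a pipe, and unpickling always builds a fresh object on the other side. A minimal sketch of what happens under the hood, using a plain `pickle` round-trip in place of an actual Pipe:

```python
import pickle

data = {'a': 1, 'b': 2}

# A Pipe send/recv is essentially a pickle round-trip across a process
# boundary: the receiver rebuilds the object from bytes, so it always
# gets a brand-new object at a new address.
clone = pickle.loads(pickle.dumps(data))

print(clone == data)   # True: same value
print(clone is data)   # False: a distinct object
```

This is why every address printed after a `recv()` in the question's output differs from the main process's 0xb7266dc8.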

Seeing the same memory address in start_in_oneshot_processes is probably due to your choice of operating system. In general, two processes do not share the same RAM at all. (See the Contexts and start methods section of the Process documentation for the differences between spawn on Windows and fork on Unix.)
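The identical addresses in the one-shot case come from fork: the child starts with a copy-on-write duplicate of the parent's address space, so id() reports the same virtual address even though writes land in separate pages. A Unix-only sketch (os.fork does not exist on Windows):

```python
import os

def child_sees_same_address():
    """Fork a child and report whether id(obj) there matches the parent's."""
    obj = object()
    parent_id = id(obj)
    pid = os.fork()
    if pid == 0:
        # Child: fork duplicated the address space, so the virtual address
        # is unchanged -- but the pages are copy-on-write, not truly shared.
        os._exit(0 if id(obj) == parent_id else 1)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0

print(child_sees_same_address())  # True on a fork-based start
```

With a spawn start method the child runs a fresh interpreter instead, and the addresses would differ even in the one-shot case.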

If it really matters that two processes can inspect the same chunk of memory, you can try shared memory or an object manager process. Note that those same docs tell you this is rarely a good idea.
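For the shared-memory route, later Python versions grew a dedicated module: `multiprocessing.shared_memory` (Python 3.8+, so not available on the 3.4 used in the question). A minimal sketch showing that two handles attached to the same block see the same bytes, with no copy; in a real program a child process would attach by name instead of receiving a pickled copy through a pipe:

```python
from multiprocessing.shared_memory import SharedMemory

# Create a 4-byte shared block and fill it from the "main" side.
block = SharedMemory(create=True, size=4)
block.buf[:4] = bytes([1, 2, 3, 4])

# Attaching by name yields a view of the SAME memory, not a copy.
view = SharedMemory(name=block.name)
total = sum(view.buf[:4])
print(total)  # 10

view.close()
block.close()
block.unlink()  # free the block once nobody needs it anymore
```

Such a block holds raw bytes (or ctypes values via `multiprocessing.sharedctypes`), not arbitrary Python objects like the `object()` in the question, which is one reason the docs discourage it for general use.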