Question

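I'm trying to convert one of my programs to use multiprocessing, preferably multiprocessing pools, since those seem simpler. At a high level, the process creates an array of patches from an image and then passes them to the GPU for object detection. The CPU and GPU parts take about 4 seconds each, but the CPU has 8 cores and does not need to wait for the GPU, since no further operations are done on the data after it has passed through the GPU.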

Here is a diagram of how I think this should work: [diagram not included]

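To help me through this, I'd like a demonstration using a high-level version of my implementation. Say we are looping over a list of images from a folder that holds 10 images. We resize the images 4 at a time. Then we convert them to black and white 2 at a time; the conversion can stand in for the GPU part of the process. Here is what the code would look like: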

import glob
import os

from PIL import Image

def im_resize(im, num1, num2):
    # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
    return im.resize((num1, num2), Image.LANCZOS)

def convert_bw(im):
    # Mode 'L' is 8-bit grayscale
    return im.convert('L')

def read_images(path):
    imlist = []
    for pathAndFileName in glob.iglob(os.path.join(path, "*")):
        if pathAndFileName.endswith((".jpg", ".JPG")):
            imlist.append(Image.open(pathAndFileName))
    return imlist


img_list = read_images("path/to/images/")
final_img_list = []

for image in img_list:

    # Resize needs to run concurrently on 4 processes so that the next img_tmp is always ready to go for convert
    img_tmp = im_resize(image, 100, 100)

    # Convert is limited, need to run on 2 processes
    img_tmp = convert_bw(img_tmp)
    final_img_list.append(img_tmp)

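The specific numbers of processes and so on come from measurements of system performance; those are the values that minimize run time. I just want to make sure the GPU never has to wait for the CPU to finish processing images, so I'd like a queue that is constantly filled with preprocessed images ready for the GPU. I'd like to cap the queue at roughly 4-10 preprocessed images. If you can help me see how to achieve this with this simplified example, I'm sure I can figure out how to translate it into what I need.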

Thanks!

Answer 1

Here is an attempt at implementing what you want:

...

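from multiprocessing import Pool

# Mapping functions can only take one arg, so we pass a tuple and unpack it
def img_resize_splat(a):
    # Return the result, otherwise the pool only ever yields None
    return im_resize(*a)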

if __name__=="__main__":
    # Make a CPU pool and a GPU pool
    cpu = Pool(4)
    gpu = Pool(2)

    # Hopefully this returns an iterable, and not a list with all images read into memory
    img_list = read_images("path/to/images/")

    # I'm assuming you want images to be processed as soon as ready, order doesn't matter
    resized = cpu.imap_unordered(img_resize_splat, ((img, 100, 100) for img in img_list))
    converted = gpu.imap_unordered(convert_bw, resized)

    # This is an iterable with your results, slurp them up one at a time
    for bw_img in converted:
        pass  # do something with bw_img
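
Note that the two chained pools above do not cap how many resized images pile up while waiting for the GPU. One way to get the 4-10 image limit from the question is to throttle the input with a semaphore. The following is only a minimal sketch under some assumptions: `cpu_stage` and `gpu_stage` are hypothetical stand-ins for `im_resize` and `convert_bw`, `throttled` and `run_pipeline` are names made up for illustration, and the GPU step runs in the parent process rather than in its own pool of 2.

```python
import threading
from multiprocessing import Pool


def throttled(items, slots):
    """Yield items one at a time, taking a slot before each one.

    The pool pulls from this generator in a background thread, so that
    thread blocks here whenever all slots are taken.
    """
    for item in items:
        slots.acquire()
        yield item


def cpu_stage(n):
    # Hypothetical stand-in for im_resize: any picklable CPU-bound function.
    return n * n


def gpu_stage(x):
    # Hypothetical stand-in for convert_bw (the GPU part).
    return x + 1


def run_pipeline(items, cpu_workers=4, max_ready=8):
    # At most max_ready items are submitted but not yet picked up below.
    slots = threading.BoundedSemaphore(max_ready)
    results = []
    with Pool(cpu_workers) as cpu:
        for ready in cpu.imap_unordered(cpu_stage, throttled(items, slots)):
            slots.release()  # this item left the buffer, allow one more in
            results.append(gpu_stage(ready))
    return results


if __name__ == "__main__":
    print(sorted(run_pipeline(range(10))))
```

The semaphore counts items that have been handed to the CPU pool but not yet picked up by the consumer loop, so `max_ready` bounds both the in-flight and the waiting images together. If you really need two GPU workers, you could feed `ready` into a second pool instead of calling `gpu_stage` directly, but I have not tried that here.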
