Multiprocessing Pool gets stuck

Date: 2020-06-02 07:39:02

Tags: numpy python-multiprocessing tensorflow2.0 nibabel

The code below is extracted from a longer script. The sequential version (without multiprocessing) works fine. However, when I use a Pool, the script gets stuck at one specific line.

I want to apply the same function crop_image in parallel to a set of medical imaging volumes whose paths are built from the lists all_subdirs and all_files. The function loads a subject's volume from its path with nib and then extracts two 3D patches from it: the first patch has shape 40x40x40 and the second has shape 80x80x80. The two patches share the same center.
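To make the shared center concrete, here is a tiny standalone sketch of the indexing (the volume shape here is made up, just for illustration):

import numpy as np

vol = np.zeros((160, 160, 160))  # made-up volume, only to illustrate the indexing
y = x = z = 60                   # one sampling center
small_patch = vol[y - 20:y + 20, x - 20:x + 20, z - 20:z + 20]
big_patch = vol[y - 40:y + 40, x - 40:x + 40, z - 40:z + 40]
print(small_patch.shape, big_patch.shape)  # (40, 40, 40) (80, 80, 80)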

In this simplified example I only load two subjects. Both processes do start, since the print inside the function does produce output:

>>> sub-001_ses-20101210_brain.nii.gz
>>> sub-002_ses-20110815_brain.nii.gz

However, when tf.image.per_image_standardization has to be applied to the 80x80x80 patch, the program hangs indefinitely. I suspect it is a memory/space issue, because if I also set the big patch to 40x40x40 (or smaller), the script runs fine.
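As far as I understand, tf.image.per_image_standardization computes (x - mean) / max(stddev, 1/sqrt(N)) over the whole tensor, so a rough NumPy equivalent (which would also keep TensorFlow out of the worker processes) should be something like:

import numpy as np

def standardize_patch(patch):
    # rough NumPy equivalent of tf.image.per_image_standardization:
    # (x - mean) / max(stddev, 1/sqrt(N)), computed over the whole patch
    patch = patch.astype(np.float32)
    adjusted_std = max(patch.std(), 1.0 / np.sqrt(patch.size))
    return (patch - patch.mean()) / adjusted_std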

What can I try? Am I doing something wrong?
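From what I have read, TensorFlow does not always survive fork-based multiprocessing, so one thing I plan to try is forcing the 'spawn' start method. A minimal, untested sketch of what I mean (work is just a stand-in for crop_image):

import multiprocessing as mp

def work(a):
    return a * 2  # stand-in for crop_image

if __name__ == '__main__':
    mp.set_start_method('spawn')  # fresh interpreter per worker instead of a fork of the parent
    with mp.Pool(processes=mp.cpu_count()) as pool:
        print(pool.map(work, [1, 2, 3]))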

The version below does run, but it is greatly simplified with respect to the real one:

import os
import multiprocessing as mp

import nibabel as nib
import numpy as np
import tensorflow as tf


def crop_image(subdir_path, file_path):
    print(file_path)
    small_scale = []
    big_scale = []

    nii_volume = nib.load(os.path.join(subdir_path, file_path)).get_fdata()  # load volume with nibabel and extract np array

    rows_range, columns_range, slices_range = nii_volume.shape  # save volume dimensions

    for y in range(20, rows_range, 40):  # loop over rows
        for x in range(20, columns_range, 40):  # loop over columns
            for z in range(20, slices_range, 40):  # loop over slices
                small_patch = nii_volume[y - 20:y + 20, x - 20:x + 20, z - 20:z + 20]  # extract small patch
                big_patch = nii_volume[y - 40:y + 40, x - 40:x + 40, z - 40:z + 40]  # extract big patch
                small_patch = tf.image.per_image_standardization(small_patch)  # standardize small patch
                small_scale.append(small_patch)  # append small patch to external list

                # HERE THE CODE GETS STUCK AND EVERYTHING BELOW IS NOT EXECUTED

                big_patch = tf.image.per_image_standardization(big_patch)  # standardize big patch
                big_scale.append(big_patch)  # append big patch to external list

    # create tf.Dataset with lists (small_scale and big_scale)
    # etc..
    # etc..

    final_results = 1  # invented number for the example

    return final_results

if __name__ == '__main__':
    all_subdirs = ['/home/newuser/Desktop/sub-001/ses-20101210/anat', '/home/newuser/Desktop/sub-002/ses-20110815/anat']
    all_files = ['sub-001_ses-20101210_brain.nii.gz', 'sub-002_ses-20110815_brain.nii.gz']

    # DEFINE pool of processes
    num_workers = mp.cpu_count()  # save number of available CPUs (threads)
    pool = mp.Pool(processes=num_workers)  # create pool object and set as many processes as there are CPUs
    outputs = [pool.apply_async(crop_image, args=(path_pair[0], path_pair[1])) for path_pair in zip(all_subdirs, all_files)]
    final_results = [out.get() for out in outputs]  # collect results; without this the main process exits before the workers finish
    pool.close()
    pool.join()

Thanks in advance!

0 Answers:

No answers yet.