Question

OS: Ubuntu 18.04 LTS
Python: 3.6.8
OpenCV: 3.4.4
CPU: Intel i5-7300HQ CPU @ 2.50GHz, 4 cores
GPU: Nvidia GTX 1050 Mobile (4 GB)

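I am trying to train a simple image-classification NN model with PyTorch (1.3.1). I have a bunch of images (35,000 photos, 175.6 MB) stored in a single folder. This is how I collect all the paths: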

import os

def _get_imgs_paths(path):
        # os.walk yields (dirpath, dirnames, filenames); only the top-level file names are needed
        _, _, imgs_paths = next(os.walk(path))
        return imgs_paths

After that, another function is called that loads all the photos:

def _get_xys(path):
        imgs_paths = DataSet._get_imgs_paths(path)
        imgs_paths = sorted(imgs_paths)
        cats_paths = list(filter(lambda path: 'cat' in path, imgs_paths))
        dogs_paths = list(filter(lambda path: 'dog' in path, imgs_paths))
        noise_paths = list(filter(lambda path: 'noise' in path, imgs_paths))
        balanced_paths = []
        for i in range(len(imgs_paths)):
            if i % 3 == 0:
                balanced_paths.append(cats_paths.pop())
            elif i % 3 == 1:
                balanced_paths.append(dogs_paths.pop())
            else:
                balanced_paths.append(noise_paths.pop())
        xs = []
        ys = []

        for img_path in balanced_paths:
            if 'cat' in img_path:
                y = global_config.dataset.CAT
            elif 'dog' in img_path:
                y = global_config.dataset.DOG
            elif 'noise' in img_path:
                y = global_config.dataset.NOISE
            else:
                raise Exception('No such class')
            ys.append(y)
            # seems like this call causes the slowdown
            img = cv2.imread(os.path.join(path, img_path))
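            # cv2.imread returns channels in BGR order, so the names below are nominal;
            # the stack converts the HxWxC array to channel-first (CxHxW)
            r, g, b = cv2.split(img)
            img = np.stack([r, g, b])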
            xs.append(img)
        return xs, ys
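
To check whether the suspicion in the comment above is right, it might help to time the raw file read separately from the image decoding. The following is only a minimal sketch: `time_reads` and its `decode` hook are hypothetical names that are not part of the code above, and it uses nothing but the standard library.

```python
import time


def time_reads(paths, decode=None):
    """Return (read_seconds, decode_seconds) summed over all paths.

    `decode` is an optional, hypothetical hook: a callable that receives
    the raw bytes of one file. It is not part of the original code.
    """
    read_s = 0.0
    decode_s = 0.0
    for p in paths:
        # Time only the disk read of the raw bytes.
        t0 = time.perf_counter()
        with open(p, "rb") as f:
            buf = f.read()
        read_s += time.perf_counter() - t0

        # Time the decoding step separately, if one was supplied.
        if decode is not None:
            t1 = time.perf_counter()
            decode(buf)
            decode_s += time.perf_counter() - t1
    return read_s, decode_s
```

With OpenCV, the hook could be something like `lambda buf: cv2.imdecode(np.frombuffer(buf, dtype=np.uint8), cv2.IMREAD_COLOR)`. If the read time dominates, the disk rather than the decoder is the likely bottleneck. Note that a repeated run may be served from the OS page cache and look much faster than the first one.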

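And here we go: loading the images into memory takes up to 5 minutes. This is ridiculous, because a few days ago everything was fast. One day I woke up, checked the results of the overnight training run, started the training process again, and noticed this huge slowdown. I can't figure out what is going on. I pressed Ctrl + C and got the following traceback: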

I did this several times: it always stops at the same place. I opened htop and noticed this:

LMAO. I just tried to reproduce the problem and, surprisingly, everything worked perfectly, i.e. pretty fast:

But then I stopped it, ran it again, and got the problem I am talking about. top:

Google says that the red D means "uninterruptible sleep".
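
On Linux the same state letter can also be read without htop, from `/proc/<pid>/stat`, where it is the first field after the parenthesised command name. A process in state D is typically blocked in the kernel waiting on I/O. The sketch below is an illustration only: `parse_state` and `process_state` are hypothetical helper names, and it assumes a Linux `/proc` filesystem.

```python
def parse_state(stat_line):
    """Extract the one-letter process state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' before reading the state.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def process_state(pid):
    """Return the current state letter of `pid` (Linux only)."""
    with open("/proc/%d/stat" % pid) as f:
        return parse_state(f.read())
```

Calling `process_state(pid)` repeatedly on the training process while it loads images would show whether it sits in D most of the time, which would point at the disk or filesystem rather than at the Python code.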

So, the question is: what is happening here, and why?

Reading files with OpenCV in Python takes too long
