Question

Consider the following code:

from threading import Thread
import numpy as np
import gc


class TargetClass:
    def __init__(self):
        self.lt = []

    def execute(self):
        print("list started")
        for idx in range(10):
            self.lt.append(np.linspace(0,100,1000000))
        print("list completed")
        self._flush()

    def _flush(self):
        self.lt.clear()
        self.lt = None
        # gc.collect()


def main():
    print("starting run")
    threads = []
    objs = []
    for idx in range(20):
        obj = TargetClass()
        objs.append(obj)
        th = Thread(target=obj.execute)
        threads.append(th)
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    for obj in objs:
        del obj
    threads.clear()
    print("run end")


if __name__ == "__main__":
    for idx in range(5):
        main()

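After the five calls to main complete, the process uses between 850 MB and 1.1 GB of RAM (the figure varies from run to run). I am checking this in Ubuntu's System Monitor by adding a pdb call after main. Interestingly, if I run the statements of the execute method one by one in a Python terminal, followed by the statements of the _flush method, the Python process uses only about 16 MB.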
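For reference, the same figure can be read from inside the process instead of from System Monitor. This is a minimal sketch, assuming Linux (it parses /proc/self/status); rss_mb is a hypothetical helper name, not part of the script above.

```python
def rss_mb():
    """Return this process's resident set size in MB (Linux only)."""
    # /proc/self/status contains a line such as "VmRSS:   16384 kB".
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024  # kB -> MB
    raise RuntimeError("VmRSS not found in /proc/self/status")


if __name__ == "__main__":
    print("RSS: {:.1f} MB".format(rss_mb()))
```

Calling rss_mb() before and after each main() call would show whether memory grows per run or plateaus.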

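I have two related questions:

Why does the process still use so much memory even after the list is cleared?
How can I reduce this memory consumption? (This is my main motivation.)

This script is a small part of my application, in which I use queues to pass data between multiple threads. Each item in the queue is about 8 MB. After all the data has been processed and the references deleted (as far as I can tell), the process's RAM usage does not fully go back down. I cannot kill the process and respawn it, because loading the dependencies takes time.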

Python 3.6.8, NumPy 1.17.4

Edit 1: If the list is filled with floats instead of NumPy array objects, memory drops back to 16 MB after the list is cleared.
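One diagnostic that may help separate a true leak from memory that the C allocator has freed but not returned to the OS: on glibc-based Linux, malloc_trim(0) asks the allocator to release free heap memory. The sketch below is an assumption-laden experiment, not a confirmed fix; trim_heap is a hypothetical helper, and it returns None on platforms where malloc_trim is not available.

```python
import ctypes
import ctypes.util


def trim_heap():
    """Ask glibc to return free heap memory to the OS.

    Returns True if memory was released, False if nothing was released,
    and None if malloc_trim is unavailable (non-glibc platforms).
    """
    libc_name = ctypes.util.find_library("c")
    if libc_name is None:
        return None
    try:
        libc = ctypes.CDLL(libc_name)
        malloc_trim = libc.malloc_trim  # glibc-specific symbol
    except (OSError, AttributeError):
        return None
    malloc_trim.argtypes = [ctypes.c_size_t]
    malloc_trim.restype = ctypes.c_int
    # glibc returns 1 if memory was released to the system, 0 otherwise.
    return malloc_trim(0) == 1


if __name__ == "__main__":
    print("malloc_trim released memory:", trim_heap())
```

If resident memory drops noticeably after calling trim_heap() at the end of main(), the arrays were freed and the remaining usage was allocator-held heap rather than live objects.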

NumPy array memory leak

0 answers: