Question

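After loading a DataFrame of about 15 million rows (roughly 250 MB) from a pickle, I run some search operations on it and then delete some rows. During these operations, memory usage spikes to 5 GB and sometimes even 7 GB, which is annoying because of swapping (my laptop has only 8 GB of RAM).

The key point is that this memory is not released when the operations finish (that is, once the last two lines of the code below have run). So the Python process still holds 7 GB of memory.

Any idea why this happens? I am using Pandas 0.20.3.

A minimal example follows. The real 'data' variable has about 15 million rows, but I don't know how to post it here.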

import datetime, pandas as pd

data = {'Time':['2013-10-29 00:00:00', '2013-10-29 00:00:08', '2013-11-14 00:00:00'], 'Watts': [0, 48, 0]}
df = pd.DataFrame(data, columns = ['Time', 'Watts'])
# Convert string to datetime
df['Time'] = pd.to_datetime(df['Time'])
# Make column Time as the index of the dataframe
df.index = df['Time']
# Delete the column time
df = df.drop('Time', axis=1)

# Get the difference in time between two consecutive data points
differences = df.index.to_series().diff()
# Keep only the differences > 60 mins
differences = differences[differences > datetime.timedelta(minutes=60)]
# Get the string of the day of the data points when the data gathering resumed
toRemove = [datetime.datetime.strftime(date, '%Y-%m-%d') for date in differences.index.date]

# Remove data points belonging to the days where the difference was > 60 mins
for dataPoint in toRemove:
    df.drop(df.loc[dataPoint].index, inplace=True)

Answer 1

You might want to try calling the garbage collector: gc.collect(). For details, see "How can I explicitly free memory in Python?"
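
As a rough sketch of that suggestion (not the asker's real workload): the snippet below builds a small synthetic frame, filters it by reassignment instead of an in-place drop, deletes the leftover references, and then calls gc.collect(). The frame size, the "keep only the first day" rule, and the helper frame_bytes are assumptions made for illustration. Note that gc.collect() can only free objects that are no longer referenced, and whether the memory the operating system reports for the process shrinks afterwards depends on the allocator.

```python
import gc

import numpy as np
import pandas as pd

# Synthetic stand-in for the real data: one reading per minute
# (size and values are placeholders, not the asker's data).
index = pd.date_range("2013-10-29", periods=5000, freq="min")
df = pd.DataFrame({"Watts": np.zeros(len(index), dtype="int64")}, index=index)


def frame_bytes(frame):
    """Bytes held by the frame's columns and index, as reported by pandas."""
    return int(frame.memory_usage(index=True, deep=True).sum())


before = frame_bytes(df)

# Keep only the rows of the first day, reassigning instead of dropping in place.
first_day = index[0].normalize()
mask = df.index.normalize() == first_day
df = df[mask]

# Drop the remaining references to the large intermediates, then collect.
del mask, index
unreachable = gc.collect()

after = frame_bytes(df)
print(before, after, unreachable)
```

The numbers printed here describe what pandas itself holds, which is a different measure from the process size shown by the operating system; a tool such as top or a process monitor is needed to check the latter.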

Pandas - huge memory consumption
