Why does a Python 2.7 dict use more space than a Python 3 dict?

Asked: 2017-07-24 15:09:04

Tags: python python-2.7 python-3.x dictionary python-internals

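I have read about Raymond Hettinger's new method of implementing compact dicts. This explains why dicts in Python 3.6 use less memory than dicts in Python 2.7-3.5. However, there also seems to be a difference between the memory used by dicts in Python 2.7 and in Python 3.3-3.5. Test code: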

import sys

n = 100  # assumed value; the original snippet left n undefined
d = {i: i for i in range(n)}
print(sys.getsizeof(d))  # size of the dict object in bytes
  • Python 2.7: 12568
  • Python 3.5: 6240
  • Python 3.6: 4704

As stated above, I understand the savings between 3.5 and 3.6, but I am curious about the cause of the savings between 2.7 and 3.5.

1 answer:

Answer 0 (score: 9)

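It turns out this is a red herring. The rules for growing a dict changed between CPython 2.7-3.2 and CPython 3.3, and again in CPython 3.4 (although that later change only matters when deletions occur). We can see this with the following code, which determines the points at which the dict expands: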

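import sys

size_old = 0
for n in range(512):
    # Build a fresh dict with n items and measure it.
    d = {i: i for i in range(n)}
    size = sys.getsizeof(d)
    # Report each n at which the reported size changes.
    if size != size_old:
        print(n, size_old, size)
    size_old = size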

Python 2.7:

(0, 0, 280)
(6, 280, 1048)
(22, 1048, 3352)
(86, 3352, 12568)

Python 3.5:

0 0 288
6 288 480
12 480 864
22 864 1632
44 1632 3168
86 3168 6240

Python 3.6:

0 0 240
6 240 368
11 368 648
22 648 1184
43 1184 2280
86 2280 4704

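Bearing in mind that dicts resize when they become 2/3 full, we can see that the CPython 2.7 dict implementation quadruples in size when it expands, while the CPython 3.5/3.6 dict implementations only double in size.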
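The quadrupling versus doubling can be checked directly from the sizes reported above. The following is a minimal sketch, not part of the original answer: the helper name `growth_ratios` is hypothetical, and the lists start from the first resized table on the assumption that the empty-dict size may include overhead that is laid out differently. If each resize multiplies the table by a constant factor, the ratio between successive size increments equals that factor, because any fixed overhead cancels out.

```python
# Sizes copied from the Python 2.7 and 3.5 outputs above,
# starting from the first resized table (assumption: skip the empty-dict size).
reported_sizes = {
    "2.7": [1048, 3352, 12568],
    "3.5": [480, 864, 1632, 3168, 6240],
}


def growth_ratios(sizes):
    """Ratio between successive size increments; constant overhead cancels out."""
    steps = [b - a for a, b in zip(sizes, sizes[1:])]
    return [later / earlier for earlier, later in zip(steps, steps[1:])]


for version, sizes in reported_sizes.items():
    print(version, growth_ratios(sizes))
# Under Python 3 this prints:
# 2.7 [4.0]
# 3.5 [2.0, 2.0, 2.0]
```

The 3.6 figures are left out of this sketch because the compact layout does not scale by an exact constant between those sizes.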

This is explained by a comment in the dict source code:
/* GROWTH_RATE. Growth rate upon hitting maximum load.
 * Currently set to used*2 + capacity/2.
 * This means that dicts double in size when growing without deletions,
 * but have more head room when the number of deletions is on a par with the
 * number of insertions.
 * Raising this to used*4 doubles memory consumption depending on the size of
 * the dictionary, but results in half the number of resizes, less effort to
 * resize.
 * GROWTH_RATE was set to used*4 up to version 3.2.
 * GROWTH_RATE was set to used*2 in version 3.3.0
 */
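To see how the two growth rules in that comment lead to the resize points listed earlier, here is a simplified model in Python. It is only a sketch under stated assumptions, not CPython's actual code: the function names are hypothetical, the initial table is assumed to have 8 slots, the new table is assumed to be the smallest power of two strictly greater than the requested size, and the usable-slot formula `(2 * size + 1) // 3` is an assumption chosen to match the 3.5 output above.

```python
def next_power_of_two_above(minused, start=8):
    """Smallest power-of-two table size strictly greater than minused (assumed rule)."""
    size = start
    while size <= minused:
        size <<= 1
    return size


def growth_points_quadruple(max_items):
    """Model of the used*4 rule: grow once an insert leaves the table 2/3 full."""
    size, points = 8, []
    for used in range(1, max_items + 1):  # used = item count after the insert
        if used * 3 >= size * 2:
            size = next_power_of_two_above(used * 4)
            points.append((used, size))
    return points


def growth_points_double(max_items):
    """Model of the used*2 + capacity/2 rule: grow when no usable slot is left."""
    size, points = 8, []
    usable = (2 * size + 1) // 3  # assumed usable-slot formula
    for used in range(max_items):  # used = item count before the insert
        if used >= usable:
            size = next_power_of_two_above(used * 2 + size // 2)
            usable = (2 * size + 1) // 3
            points.append((used + 1, size))
    return points


print(growth_points_quadruple(100))  # [(6, 32), (22, 128), (86, 512)]
print(growth_points_double(100))     # [(6, 16), (12, 32), (22, 64), (44, 128), (86, 256)]
```

In this model the item counts at which growth happens (6, 22, 86 versus 6, 12, 22, 44, 86) line up with the Python 2.7 and 3.5 outputs above, and at 86 items the quadrupling rule has allocated twice as many slots as the doubling rule, which is consistent with the roughly 2x difference in reported size.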