The trouble with compressing large data in Python

Date: 2014-05-27 05:30:21

Tags: python zlib

I have a script in Python that compresses a large string:

import zlib

def processFiles():
  ...
  s = """Large string more than 2Gb"""
  data = zlib.compress(s)
  ...

When I run this script, I get the error:

ERROR: Traceback (most recent call last):
  File "./../commands/sce.py", line 438, in processFiles
    data = zlib.compress(s)
OverflowError: size does not fit in an int

Some information:

zlib.__version__ = '1.0'

zlib.ZLIB_VERSION = '1.2.7'

# python -V
Python 2.7.3

# uname -a
Linux app2 3.2.0-4-amd64 #1 SMP Debian 3.2.54-2 x86_64 GNU/Linux

# free
             total       used       free     shared    buffers     cached
Mem:      65997404    8096588   57900816          0     184260    7212252
-/+ buffers/cache:     700076   65297328
Swap:     35562236          0   35562236

# ldconfig -p | grep python
libpython2.7.so.1.0 (libc6,x86-64) => /usr/lib/libpython2.7.so.1.0
libpython2.7.so (libc6,x86-64) => /usr/lib/libpython2.7.so

How can I compress large data (more than 2 GB) in Python?

3 Answers:

Answer 0 (score: 3)

My function for compressing large data:

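A minimal sketch of such a function, streaming a file through `zlib.compressobj` in fixed-size blocks so no single buffer approaches the 2 GiB limit (the function name, paths, and block size are illustrative, not from the original answer):

```python
import zlib

def compress_file(src_path, dst_path, block_size=1 << 20):
    """Compress src_path into dst_path, reading 1 MiB blocks."""
    comp = zlib.compressobj()
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(comp.compress(block))
        # flush() emits any data still buffered inside zlib;
        # without it the output stream is truncated.
        dst.write(comp.flush())
```

The result is a single zlib stream, so a plain `zlib.decompress` of the output file yields the original data.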

Answer 1 (score: 2)

This is not a RAM problem. Simply put, the Python zlib binding passes the buffer size as a C int, so neither `zlib.compress` nor `zlib.decompress` can accept a single buffer larger than 2 GiB — which is exactly what the OverflowError is saying.

Split the data into chunks smaller than 2 GiB and process each chunk separately.
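The splitting described above can be sketched like this (`CHUNK` and `compress_big` are illustrative names; because all slices go through one `compressobj`, the output is still a single zlib stream):

```python
import zlib

CHUNK = 1 << 30  # 1 GiB per slice, safely below the C int limit

def compress_big(data):
    # Feed the large bytes object to one compressobj in slices,
    # so no single call passes a buffer larger than CHUNK.
    comp = zlib.compressobj()
    out = []
    for i in range(0, len(data), CHUNK):
        out.append(comp.compress(data[i:i + CHUNK]))
    out.append(comp.flush())  # emit zlib's remaining buffered bytes
    return b''.join(out)
```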

Answer 2 (score: 0)

Try streaming...

import zlib

compressor = zlib.compressobj()
chunks = []
with open('/var/log/syslog', 'rb') as inputfile:
    # read in 1 MiB blocks so no single buffer hits the int limit
    for block in iter(lambda: inputfile.read(1024 * 1024), b''):
        chunks.append(compressor.compress(block))
chunks.append(compressor.flush())  # don't forget the final flush
data = b''.join(chunks)

print len(data)
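Going the other way, the same streaming idea works for decompression with `zlib.decompressobj`, which likewise avoids handing one oversized buffer to the binding (`decompress_stream` is an illustrative name, not part of the answer above):

```python
import zlib

def decompress_stream(blob, block_size=1 << 20):
    # Mirror of the streaming compression: feed the compressed
    # bytes to a decompressobj in fixed-size slices.
    decomp = zlib.decompressobj()
    out = []
    for i in range(0, len(blob), block_size):
        out.append(decomp.decompress(blob[i:i + block_size]))
    out.append(decomp.flush())  # recover any bytes still buffered
    return b''.join(out)
```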