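Extract files from one ZIP directly into another ZIP

Time: 2017-02-17 15:03:38

Tags: python python-3.x archive zipfile

My goal is to extract certain files from a Zip archive and transfer them directly into another Zip, without an intermediate extraction to disk.

So far I have: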

from zipfile import ZipFile, ZIP_DEFLATED


def stream_conents(src_zip, dst_zip, file_subset_list):
    with ZipFile(src_zip, "r", compression=ZIP_DEFLATED) as src_zip_archive:
        with ZipFile(dst_zip, "w", compression=ZIP_DEFLATED) as dst_zip_archive:
            for zitem in src_zip_archive.namelist():
                if zitem in file_subset_list:
                    zitem_object = src_zip_archive.open(zitem)
                    dst_zip_archive.write(zitem_object, zitem, )

But it just throws TypeError: argument should be string, bytes or integer, not ZipExtFile

1 answer:

Answer 0: (score: 4)

You can read the whole file into memory and use writestr to write it into the destination archive.

from zipfile import ZipFile, ZIP_DEFLATED


def stream_conents(src_zip, dst_zip, file_subset_list):
    with ZipFile(src_zip, "r", compression=ZIP_DEFLATED) as src_zip_archive:
        with ZipFile(dst_zip, "w", compression=ZIP_DEFLATED) as dst_zip_archive:
            for zitem in src_zip_archive.namelist():
                if zitem in file_subset_list:
                    # warning: read() loads the whole decompressed member
                    # into memory, which may blow up on large files
                    dst_zip_archive.writestr(zitem,
                        src_zip_archive.read(zitem))
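
One side effect of passing only a name to writestr is that the copied entry gets a new timestamp rather than the one stored in the source archive. If that matters, a minimal sketch along these lines may help; it assumes you want to keep each member's modification time and permission bits. The function name copy_with_metadata and the parameter wanted_names are made up for this illustration, and the member is still read fully into memory.

```python
from zipfile import ZipFile, ZipInfo, ZIP_DEFLATED


def copy_with_metadata(src_zip, dst_zip, wanted_names):
    """Copy selected members via writestr, keeping timestamp and mode bits."""
    wanted = set(wanted_names)
    with ZipFile(src_zip, "r") as src, \
            ZipFile(dst_zip, "w", compression=ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename not in wanted:
                continue
            # Build a fresh ZipInfo so the source archive's entry object
            # is not modified when the destination archive writes it.
            new_info = ZipInfo(filename=info.filename, date_time=info.date_time)
            new_info.compress_type = ZIP_DEFLATED
            new_info.external_attr = info.external_attr
            # Still loads the whole decompressed member into memory.
            dst.writestr(new_info, src.read(info))
```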

As of Python 3.6, ZipFile.open can open an archive member in write mode. This lets you write the file in chunks and reduces overall memory usage.

import shutil
import sys
from zipfile import ZipFile, ZIP_DEFLATED


def stream_conents(src_zip, dst_zip, file_subset_list):
    with ZipFile(src_zip, "r", compression=ZIP_DEFLATED) as src_zip_archive:
        with ZipFile(dst_zip, "w", compression=ZIP_DEFLATED) as dst_zip_archive:
            for zitem in src_zip_archive.namelist():
                if zitem in file_subset_list:
                    if sys.version_info >= (3, 6):
                        # copy in chunks instead of loading the whole member
                        with src_zip_archive.open(zitem) as from_item:
                            with dst_zip_archive.open(zitem, "w") as to_item:
                                shutil.copyfileobj(from_item, to_item)
                    else:
                        # warning, may blow up memory
                        dst_zip_archive.writestr(zitem,
                            src_zip_archive.read(zitem))
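
To try the chunked approach without touching the disk at all, here is a small self-contained sketch for Python 3.6+ that uses in-memory io.BytesIO buffers in place of real files. The names copy_members, wanted_names and chunk_size, and the sample file names, are invented for this example. Note that each member is still decompressed from the source and recompressed into the destination; only the peak memory use is bounded by the chunk size.

```python
import io
import shutil
from zipfile import ZipFile, ZIP_DEFLATED


def copy_members(src_zip, dst_zip, wanted_names, chunk_size=64 * 1024):
    """Stream selected members from src_zip into dst_zip (Python 3.6+)."""
    wanted = set(wanted_names)  # set lookup instead of scanning a list
    with ZipFile(src_zip, "r") as src, \
            ZipFile(dst_zip, "w", compression=ZIP_DEFLATED) as dst:
        for name in src.namelist():
            if name in wanted:
                with src.open(name) as from_item, \
                        dst.open(name, "w") as to_item:
                    # Copies chunk_size bytes at a time.
                    shutil.copyfileobj(from_item, to_item, chunk_size)


# In-memory archives stand in for real files in this demo.
src_buf = io.BytesIO()
with ZipFile(src_buf, "w") as archive:
    archive.writestr("keep.txt", b"keep me")
    archive.writestr("skip.txt", b"skip me")

dst_buf = io.BytesIO()
copy_members(src_buf, dst_buf, ["keep.txt"])

with ZipFile(dst_buf) as result:
    copied_names = result.namelist()
    copied_data = result.read("keep.txt")
```

The same function should also accept ordinary file paths for src_zip and dst_zip, since ZipFile takes either a path or a seekable file-like object.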