Question

I need to extract a file named Preview.pdf from a folder named QuickLooks inside a zip file.

Right now my code looks a bit like this:

with ZipFile(newName, 'r') as newName:
        newName.extract(\QuickLooks\Preview.pdf)
        newName.close()

(In this case, newName has been set equal to the full path of the zip.)

It's important to note that the backslash is correct in this case, because I'm on Windows.

The code doesn't work; this is the error it gives:

Traceback (most recent call last):
  File "C:\Users\Asit\Documents\Evam\Python_Scripts\pageszip.py", line 18, in <module>
    ZF.extract("""QuickLooks\Preview.pdf""")
  File "C:\Python33\lib\zipfile.py", line 1019, in extract
    member = self.getinfo(member)
  File "C:\Python33\lib\zipfile.py", line 905, in getinfo
    'There is no item named %r in the archive' % name)
KeyError: "There is no item named 'QuickLook/Preview.pdf' in the archive"

I'm running the Python script from within Notepad++ and taking the output from its console.

How can I accomplish this?

Alternatively, how could I extract the whole QuickLooks folder, move Preview.pdf out of it, and then delete the folder and the rest of its contents?

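Just for context, here's the rest of the script. It's a script to get a PDF out of a .pages file. I know there are bona fide converters out there; I'm just doing this as an exercise with some sort of real-world application.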

import os.path
import zipfile
from zipfile import *
import sys

file = raw_input('Enter the full path to the .pages file in question. Please note that file and directory names cannot contain any spaces.')
dir = os.path.abspath(os.path.join(file, os.pardir))
fileName, fileExtension = os.path.splitext(file)
if fileExtension == ".pages":
    os.chdir(dir)
    print (dir)
    fileExtension = ".zip"
    os.rename (file, fileName + ".zip")
    newName = fileName + ".zip"  #for debugging purposes
    print (newName) #for debugging purposes
    with ZipFile(newName, 'w') as ZF:
        print("I'm about to list names!")
        print(ZF.namelist()) #for debugging purposes
        ZF.extract("QuickLook/Preview.pdf")
    os.rename('Preview.pdf', fileName + '.pdf')
    finalPDF = fileName + ".pdf"
    print ("Check out the PDF! It's located at" + dir +  finalPDF + ".")
else:
    print ("Sorry, this is not a valid .pages file.")
    sys.exit

I'm not sure whether the zipfile import is redundant; I read in another SO post that using from zipfile import * is better than using import zipfile. I wasn't sure, so I used both. =)

EDIT: I have changed the code to reflect the changes suggested by Blckknght.

Answer 1

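Here's something that seems to work. There were several issues with your code. As I mentioned in a comment, the zip file must be opened with mode 'r' in order to read it. Another is that zip archive member names always use forward slash / characters as separators in their path names (see section 4.4.17.1 of the PKZIP Application Note), even on Windows. It's also important to be aware that Python's current zipfile module gives you no way to extract a nested archive member into a different subdirectory. You can control the root directory, but nothing below it (i.e. any subfolders within the zip).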
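To illustrate the first two points, here is a minimal sketch that builds a small throwaway archive and then reads it back. The archive name demo.pages and its dummy contents are made up for illustration and are not taken from the question; ZipFile, writestr() and namelist() are the only zipfile calls used.

```python
import os
import tempfile
from zipfile import ZipFile

# Hypothetical archive, created only so the example is self-contained.
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, 'demo.pages')
with ZipFile(archive_path, 'w') as zf:
    zf.writestr('QuickLooks/Preview.pdf', b'%PDF-1.4 dummy')

# Reopen in mode 'r' to read; mode 'w' would start a new, empty archive.
with ZipFile(archive_path, 'r') as zf:
    names = zf.namelist()

# Member names are stored with forward slashes, so only the first lookup matches.
has_forward = 'QuickLooks/Preview.pdf' in names
has_backslash = 'QuickLooks\\Preview.pdf' in names
print(names, has_forward, has_backslash)
```

Printing namelist() like this before calling extract() is a quick way to confirm the exact spelling of the member path, including whether the folder is QuickLook or QuickLooks in a given file.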

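Lastly, since it's not necessary to rename the .pages file to .zip (the filename you pass to ZipFile() can have any extension), I removed all of that from the code. However, to overcome the limitation on extracting members to a different subdirectory, I had to add code that first extracts the target member to a temporary directory and then copies it to the final destination. Afterwards, of course, this temporary folder needs to be deleted. So I'm not sure the net result is much simpler...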

import os.path
import shutil
import sys
import tempfile
from zipfile import ZipFile

PREVIEW_PATH = 'QuickLooks/Preview.pdf'  # archive member path

pages_file = input('Enter the path to the .pages file in question: ')
#pages_file = r'C:\Stack Overflow\extract_test.pages'  # hardcode for testing
pages_file = os.path.abspath(pages_file)
filename, file_extension = os.path.splitext(pages_file)
if file_extension == ".pages":
    tempdir = tempfile.gettempdir()
    temp_filename = os.path.join(tempdir, PREVIEW_PATH)
    with ZipFile(pages_file, 'r') as zipfile:
        zipfile.extract(PREVIEW_PATH, tempdir)
    if not os.path.isfile(temp_filename):  # extract failure?
        sys.exit('unable to extract {} from {}'.format(PREVIEW_PATH, pages_file))
    final_PDF = filename + '.pdf'
    shutil.copy2(temp_filename, final_PDF)  # copy and rename extracted file
    # delete the temporary subdirectory created (along with pdf file in it)
    shutil.rmtree(os.path.join(tempdir, os.path.split(PREVIEW_PATH)[0]))
    print('Check out the PDF! It\'s located at "{}".'.format(final_PDF))
    #view_file(final_PDF)  # see Bonus below
else:
    sys.exit('Sorry, that isn\'t a .pages file.')
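A possible alternative to the temporary directory, which is not what the code above does, is to skip extract() altogether: read the member as a stream with ZipFile.open() and copy it to any destination with shutil.copyfileobj(). The following is only a sketch under that assumption; the helper name extract_member_to and the demo archive are hypothetical.

```python
import os
import shutil
import tempfile
from zipfile import ZipFile

def extract_member_to(archive_path, member, dest_path):
    """Copy one archive member to dest_path without recreating its folders."""
    with ZipFile(archive_path, 'r') as zf:
        # zf.open() raises KeyError if the member is not in the archive.
        with zf.open(member) as src, open(dest_path, 'wb') as dst:
            shutil.copyfileobj(src, dst)
    return dest_path

# Throwaway archive so the sketch runs on its own.
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, 'demo.pages')
with ZipFile(archive_path, 'w') as zf:
    zf.writestr('QuickLooks/Preview.pdf', b'%PDF-1.4 dummy')

final_pdf = extract_member_to(archive_path, 'QuickLooks/Preview.pdf',
                              os.path.join(workdir, 'demo.pdf'))
print(final_pdf)
```

Because nothing is written under a QuickLooks subfolder, there is no temporary directory to clean up afterwards. Unlike shutil.copy2(), this approach does not carry over the file's timestamps.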

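Bonus: If you'd like to actually view the final pdf file from the script, you can add the following function and use it on the final pdf created (assuming you have a PDF viewer application installed on your system):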

import subprocess

def view_file(filepath):
    subprocess.Popen(filepath, shell=True).wait()
