How do I "pause" and "resume" a download?

Time: 2012-09-03 07:58:47

Tags: python binary download

Normally, downloading a file from a server looks like this:

import urllib2

fp = open(filename, 'wb')      # local file to write the download to
req = urllib2.urlopen(url)     # open the remote URL
for line in req:               # read the response and write it out
    fp.write(line)
fp.close()

If the process is stopped or interrupted part-way through, the download has to start over from the beginning... So I'd like my program to be able to pause and later resume the download. How can I make that actually work? Thanks.

2 Answers:

Answer 0 (score: 8)

The web server must support range requests in order to allow pausing and resuming a download.

If the client wants to retrieve only a specific byte range, it adds a Range header to the request:

Range: bytes=0-999

The server then returns a 206 Partial Content response, like this:

HTTP/1.0 206 Partial Content
Accept-Ranges: bytes
Content-Length: 1000
Content-Range: bytes 0-999/2200

... (the first 1000 bytes of the file) ...

For the details, see http://www.w3.org/Protocols/rfc2616/rfc2616.html
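
As a rough sketch of how this fits together with the question's urllib2 code (the URL and local filename below are placeholders, not anything from the original post), you can set the Range header yourself and append to the partially downloaded file:

import os
import urllib2

url = "http://example.com/big.zip"   # placeholder URL
local_file = "big.zip"               # placeholder local path

# How much do we already have? Ask the server only for the missing tail.
start = os.path.getsize(local_file) if os.path.exists(local_file) else 0

req = urllib2.Request(url)
if start > 0:
    req.add_header("Range", "bytes=%d-" % start)

resp = urllib2.urlopen(req)

# 206 means the server honoured the Range header, so we can append;
# 200 means it ignored the header and sent the whole file, so start over.
mode = "ab" if resp.getcode() == 206 else "wb"
fp = open(local_file, mode)
while True:
    chunk = resp.read(8192)
    if not chunk:
        break
    fp.write(chunk)
fp.close()
resp.close()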

Answer 1 (score: 1)

In Python, you can do it like this:

import urllib, os

class myURLOpener(urllib.FancyURLopener):
    """Create sub-class in order to overide error 206.  This error means a
       partial file is being sent,
       which is ok in this case.  Do nothing with this error.
    """
    def http_error_206(self, url, fp, errcode, errmsg, headers, data=None):
        pass

loop = 1
dlFile = "2.6Distrib.zip"
existSize = 0
myUrlclass = myURLOpener()
if os.path.exists(dlFile):
    outputFile = open(dlFile,"ab")
    existSize = os.path.getsize(dlFile)
    #If the file exists, then only download the remainder
    myUrlclass.addheader("Range","bytes=%s-" % (existSize))
else:
    outputFile = open(dlFile,"wb")

webPage = myUrlclass.open("http://localhost/%s" % dlFile)

#If the file exists, but we already have the whole thing, don't download again
if int(webPage.headers['Content-Length']) == existSize:
    loop = 0
    print "File already downloaded"

numBytes = 0
while loop:
    data = webPage.read(8192)
    if not data:
        break
    outputFile.write(data)
    numBytes = numBytes + len(data)

webPage.close()
outputFile.close()

for k,v in webPage.headers.items():
    print k, "=", v
print "copied", numBytes, "bytes from", webPage.url

You can find the original recipe here: http://code.activestate.com/recipes/83208-resuming-download-of-a-file/

It only works for HTTP downloads.
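
Also note that not every server honours Range requests, so it can be worth checking first. A quick sketch (using Python 2's httplib; the host and path are placeholders) sends a HEAD request and looks at the Accept-Ranges header shown in the answer above:

import httplib

conn = httplib.HTTPConnection("example.com")   # placeholder host
conn.request("HEAD", "/big.zip")               # placeholder path
resp = conn.getresponse()
# Servers that support resuming usually report "Accept-Ranges: bytes".
print resp.getheader("Accept-Ranges", "none")
conn.close()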