Python urllib2 unfinished download file: answers

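【Question title】: Python urllib2 unfinished download file
【Posted】: 2013-04-23 12:48:12
【Question】:

This script downloads a file from a website. It has a problem with large files: lost packets cause the download to stop before the file is complete. Here is the code: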

import sys
import urllib2

def download(self):

    adres = r"http://example.com/100_MbFile.zip"
    # Local file name: last path segment, minus any fragment or query string
    local = adres.split('/')[-1].split('#')[0].split('?')[0]

    print "Preparing to save file " + local
    u = urllib2.urlopen(adres)
    f = open(local, 'wb')
    try:
        meta = u.info()
        file_size = int(meta.getheaders("Content-Length")[0])
        print("Downloading: {0} Bytes: {1}".format(adres, file_size))

        file_size_dl = 0
        block_sz = 8192
        while True:
            buffer = u.read(block_sz)
            if not buffer:
                break

            file_size_dl += len(buffer)
            f.write(buffer)
            p = float(file_size_dl) / file_size
            status = r"{0}  [{1:.2%}]".format(file_size_dl, p)
            # Trailing backspaces make the next status overwrite this one
            status = status + chr(8)*(len(status)+1)
            sys.stdout.write(status)
    finally:
        # Close both handles whether or not the transfer completed
        f.close()
        u.close()

    if file_size_dl != file_size:
        print "Incomplete download: {0} of {1} bytes".format(file_size_dl, file_size)

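Any idea how to download large files?

【Comments】:

Have you looked at this thread? stackoverflow.com/questions/1979435/…

Tags: python download urllib2

【Solution 1】:

For downloading and saving a file in Python 2, you have a few options...

You can use urllib: http://docs.python.org/2/library/urllib.html#urllib.urlretrieve

This does basically what you are attempting: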

import urllib

filename = '100_MbFile.zip'
url = 'http://example.com/' + filename

urllib.urlretrieve(url, filename)

...or you can use urllib2 and specify the chunk size to read (as in your example code).

import urllib2

filename = '100_MbFile.zip'
url = 'http://example.com/' + filename

req = urllib2.urlopen(url)
block_sz = 8192
with open(filename, 'wb') as f:
    while True:
        chunk = req.read(block_sz)
        if not chunk:
            break
        f.write(chunk)

Note: in Python 3 the standard library was reorganized, and both of these can be found in urllib.request: http://docs.python.org/3.0/library/urllib.request.html
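
For reference, here is a rough Python 3 counterpart of the chunked loop above, using urllib.request.urlopen. The function name download and its return value (the number of bytes written) are my own choices for illustration, not anything from the standard library:

```python
import urllib.request


def download(url, filename, block_sz=8192):
    """Stream `url` to `filename` in fixed-size chunks; return bytes written.

    `download` is a hypothetical helper name used only for this sketch.
    """
    written = 0
    # The response object and the file are both closed when the block exits
    with urllib.request.urlopen(url) as resp, open(filename, "wb") as f:
        while True:
            chunk = resp.read(block_sz)
            if not chunk:  # an empty read means the stream has ended
                break
            f.write(chunk)
            written += len(chunk)
    return written
```

Called as download('http://example.com/100_MbFile.zip', '100_MbFile.zip'), it never holds more than one block in memory. Comparing the return value with the Content-Length header is one way to notice a transfer that stopped early.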

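【Discussion】:

This solution does not work because the retrieval is incomplete... I am looking for a solution for large files.
This should work for files of any size (even larger than what can be buffered in your machine's memory). If you are not getting the full content, something is wrong with your service.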
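
If the connection really does drop partway through, one possible workaround is to compare the bytes received against Content-Length and request the missing tail with an HTTP Range header. The sketch below is in Python 3 and rests on assumptions: the server must honour Range requests (answering 206), and download_with_resume, max_attempts and the opener parameter are names invented here for illustration. Treat it as a starting point, not a tested recipe for any particular server.

```python
import http.client
import urllib.request


def download_with_resume(url, filename, max_attempts=5, block_sz=8192,
                         opener=urllib.request.urlopen):
    """Download `url` to `filename`, re-requesting the missing tail after a drop.

    Hypothetical helper: returns the number of bytes written, or raises
    IOError if the file is still incomplete after `max_attempts` tries.
    """
    received = 0   # bytes written to disk so far
    total = None   # full size, once the server has reported it
    for _ in range(max_attempts):
        req = urllib.request.Request(url)
        if received:
            # Ask only for the part that is still missing
            req.add_header("Range", "bytes={0}-".format(received))
        clean = False
        try:
            with opener(req) as resp:
                if received and resp.status != 206:
                    received = 0  # server ignored Range: start from scratch
                length = resp.headers.get("Content-Length")
                if length is not None:
                    # For a 206 reply, Content-Length covers only the tail
                    total = received + int(length)
                with open(filename, "ab" if received else "wb") as f:
                    while True:
                        chunk = resp.read(block_sz)
                        if not chunk:
                            break
                        f.write(chunk)
                        received += len(chunk)
            clean = True
        except (http.client.HTTPException, OSError):
            # URLError and socket errors are OSError subclasses;
            # keep what is already on disk and try again
            pass
        if received == total or (clean and total is None):
            return received
    raise IOError("download incomplete: got {0} of {1} bytes".format(received, total))
```

The size check after each attempt matters because a read can end early without raising anything, which matches the symptom in the question. The opener argument exists only so the function can be exercised without a network. In normal use you would leave it at its default.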