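【Question title】: How to measure download speed and progress using requests?
【Posted】: 2014-01-15 01:34:41
【Question】:

I'm using requests to download files, but for large files I have to check the size of the file on disk every time, because I can't display the progress as a percentage. I'd also like to know the download speed. How can I do that? Here is my code: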

import requests
import sys
import time
import os

def downloadFile(url, directory) :
  localFilename = url.split('/')[-1]
  r = requests.get(url, stream=True)

  start = time.clock()
  f = open(directory + '/' + localFilename, 'wb')
  for chunk in r.iter_content(chunk_size = 512 * 1024) :
        if chunk :
              f.write(chunk)
              f.flush()
              os.fsync(f.fileno())
  f.close()
  return (time.clock() - start)

def main() :
  if len(sys.argv) > 1 :
        url = sys.argv[1]
  else :
        url = raw_input("Enter the URL : ")
  directory = raw_input("Where would you want to save the file ?")

  time_elapsed = downloadFile(url, directory)
  print "Download complete..."
  print "Time Elapsed: " + str(time_elapsed)


if __name__ == "__main__" :
  main()

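I think one way would be to read the file on every iteration of the for loop and compute the progress percentage from the Content-Length header. But that would again be a problem for large files (around 500MB). Is there another way to do it?

【Comments】:

    Tags: python python-2.7 progress python-requests download-speed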


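    【Solution 1】:

    See here: Python progress bar and downloads

    I think the code should look something like this. It shows the average speed since the start of the download, in bytes per second: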

    import requests
    import sys
    import time
    
    def downloadFile(url, directory) :
      localFilename = url.split('/')[-1]
      with open(directory + '/' + localFilename, 'wb') as f:
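        start = time.time()  # wall-clock time; time.clock() measures CPU time on Unix
        r = requests.get(url, stream=True)
        total_length = r.headers.get('content-length')
        dl = 0
        if total_length is None: # no content length header
          f.write(r.content)
        else:
          total_length = int(total_length)  # header values are strings
          for chunk in r.iter_content(1024):
            dl += len(chunk)
            f.write(chunk)
            done = int(50 * dl / total_length)  # 50 is the bar width in characters
            sys.stdout.write("\r[%s%s] %s bytes/s" % ('=' * done, ' ' * (50-done), dl//(time.time() - start)))
            sys.stdout.flush()
          print ''  # end the progress line once the download has finished
      return (time.time() - start)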
    
    def main() :
      if len(sys.argv) > 1 :
            url = sys.argv[1]
      else :
            url = raw_input("Enter the URL : ")
      directory = raw_input("Where would you want to save the file ?")
    
      time_elapsed = downloadFile(url, directory)
      print "Download complete..."
      print "Time Elapsed: " + str(time_elapsed)
    
    
    if __name__ == "__main__" :
      main()
    

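    【Comments】:

    • This code looks good, but IMO it won't show the download progressing live, because a call to requests.get(...) downloads the whole file before the get function returns. What is needed here is live (dynamic) behavior.
    • @sonukumar, note the stream parameter in the get call, requests.get(url, stream=True). See the documentation.
    • @freeforalltousez What does multiplying by 50 mean when computing the download percentage?
    • @Juancho That's the length of the progress bar. See the linked answer.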
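
The bar arithmetic asked about in the comments can be isolated into small helpers. What follows is a minimal Python 3 sketch, not the answerer's code: `render_bar` and `track_progress` are hypothetical names, and `chunks` stands for any iterable of byte strings, such as `r.iter_content(1024)` on a response obtained with `requests.get(url, stream=True)`. Bytes are counted in memory as they arrive, so the file on disk never has to be re-read, and a byte count is still available when no Content-Length header is present.

```python
import time


def render_bar(downloaded, total, width=50):
    """Return a text bar such as '[=====     ]' for downloaded/total bytes."""
    # width is the bar length in characters (the "50" in the answer above).
    done = min(width, int(width * downloaded / total))
    return "[" + "=" * done + " " * (width - done) + "]"


def track_progress(chunks, total=None, clock=time.perf_counter):
    """Yield (chunk, downloaded, percent, bytes_per_second) for each chunk.

    percent is None when the total size is unknown.
    """
    start = clock()
    downloaded = 0
    for chunk in chunks:
        downloaded += len(chunk)
        elapsed = clock() - start
        percent = 100.0 * downloaded / total if total else None
        speed = downloaded / elapsed if elapsed > 0 else 0.0
        yield chunk, downloaded, percent, speed


if __name__ == "__main__":
    fake_chunks = [b"x" * 1024] * 4  # stand-in for r.iter_content(1024)
    for chunk, downloaded, percent, speed in track_progress(fake_chunks, total=4096):
        print(render_bar(downloaded, 4096, width=20), "%.0f%%" % percent)
```

In a real download, each yielded `chunk` would be written to the output file inside the loop, and `total` would come from `int(r.headers['content-length'])` when that header exists.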
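    【Solution 2】:

    An improved Python 3 version of the accepted answer that uses io.BytesIO (writing to memory), reports Mbps, and supports ipv4/ipv6, size and port parameters.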

    import sys, time, io, requests
    
    def speed_test(size=5, ipv="ipv4", port=80):
        
        if size == 1024:
            size = "1GB"
        else:
            size = f"{size}MB"
    
        url = f"http://{ipv}.download.thinkbroadband.com:{port}/{size}.zip"
    
        with io.BytesIO() as f:
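            start = time.perf_counter()  # time.clock() was removed in Python 3.8
            r = requests.get(url, stream=True)
            total_length = r.headers.get('content-length')
            dl = 0
            if total_length is None: # no content length header
                f.write(r.content)
            else:
                total_length = int(total_length)  # header values are strings
                for chunk in r.iter_content(1024):
                    dl += len(chunk)
                    f.write(chunk)
                    done = int(30 * dl / total_length)
                    # NOTE: bytes/s divided by 100000 is only a rough scale, not true
                    # megabits/s (which would be dl * 8 / elapsed / 1000000).
                    sys.stdout.write("\r[%s%s] %s Mbps" % ('=' * done, ' ' * (30-done), dl//(time.perf_counter() - start) / 100000))
                    sys.stdout.flush()
    
        print( f"\n{size} = {(time.perf_counter() - start):.2f} seconds")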
    

    Usage examples:

    speed_test()
    speed_test(10)
    speed_test(50, "ipv6")
    speed_test(1024, port=8080)
    

    Sample output:

    [==============================] 61.34037 Mbps
    100MB = 17.10 seconds
    

    Available options:

    size: 5, 10, 20, 50, 100, 200, 512, 1024

    ipv: ipv4, ipv6

    port: 80, 81, 8080

    【Comments】:

    • The function time.clock() was deprecated in Python 3.3 and has since been removed (in Python 3.8); the solution code above should use time.perf_counter() instead.
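
As a rough illustration of the timing point raised in this comment, the sketch below measures elapsed wall-clock time with `time.perf_counter()` and converts a byte count into megabits per second. `to_mbps` and `timed` are hypothetical helper names that do not appear in the answers, and the conversion assumes 1 megabit = 1,000,000 bits.

```python
import time


def to_mbps(num_bytes, seconds):
    """Convert a byte count and a duration into megabits per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return num_bytes * 8 / seconds / 1_000_000


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds) using perf_counter."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in for a download: build 1 MiB in memory and time it.
    payload, elapsed = timed(lambda: b"x" * (1024 * 1024))
    print("%d bytes in %.6f s" % (len(payload), elapsed))
    print("1,000,000 bytes in 1 s = %.1f Mbps" % to_mbps(1_000_000, 1.0))
```

Keeping the unit conversion in its own function makes it easy to check by hand: 125,000 bytes transferred in one second is exactly 1 Mbps under the assumption above.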