【Question title】: How to upload large (~100Mb) files?
【Posted】: 2020-06-05 08:19:14
【Question】:

I have the following Flask application.

fileform.html:

<html>
    <head>
        <title>Simple file upload using Python Flask</title>
    </head>
    <body>
        <form action="/getSignature" method="post" enctype="multipart/form-data">
          Choose the file: <input type="file" name="photo"/><BR>
              <input type="submit" value="Upload"/>
        </form>
    </body>
</html>

app.py:

import os
from flask import Flask, request, render_template, url_for, redirect


app = Flask(__name__)


@app.route("/")
def fileFrontPage():
    return render_template('fileform.html')

@app.route("/getSignature", methods=['POST'])
def handleFileUpload():
    if 'photo' in request.files:
        photo = request.files['photo']
        if photo.filename != '':
            filepath = os.path.join('/flask/files', photo.filename)
            photo.save(filepath)
    return render_template('result.html')

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

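This works for smaller files but not for larger ones. After I click the Upload button, the browser shows (Uploading 13%..), then times out and displays ERR_CONNECTION_RESET. No errors appear in the Flask application.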

I serve the app through Nginx. When I check the nginx logs, I see

2020/02/20 22:38:47 [error] 6#6: *170614 client intended to send too large body: 80762097 bytes, client: 10.2.16.178, server: localhost, request: "POST /getSignature

Do I need to add any Nginx configuration for this?

I want to upload files of up to 100 MB.

【Question comments】:

  • How are you running app.py? From a console/IDE, or through a web server such as apache/nginx?
  • Please don't tell me you're using the development server

Tags: python flask


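【Solution 1】:

The request body size is probably being capped at a default value.

Since you are using nginx, an upstream timeout may also be at play (the upstream keepalive_timeout defaults to 60s). Add the following settings to nginx: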

keepalive_timeout 900s;

#extra
proxy_read_timeout 900s;
uwsgi_read_timeout 900s;

More tuning options here

If you are using a Python WSGI server such as uwsgi, set keep-alive there as well:

http-keepalive 900;

【Comments】:

  • Adding client_max_body_size 100M; to the Nginx configuration solved the problem
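
The arithmetic behind that fix can be checked with a short sketch. It assumes nginx's documented default of client_max_body_size 1m and the usual k/m size suffixes (1m = 1024 * 1024 bytes); the helper name parse_nginx_size is hypothetical and not part of nginx or Flask. The request size is the one reported in the nginx log from the question.

```python
def parse_nginx_size(value: str) -> int:
    """Convert an nginx-style size such as '1m' or '100M' to bytes.

    Hypothetical helper: handles plain byte counts plus the k/K and m/M suffixes.
    """
    units = {"k": 1024, "m": 1024 ** 2}
    value = value.strip()
    suffix = value[-1].lower()
    if suffix in units:
        return int(value[:-1]) * units[suffix]
    return int(value)


# Body size taken from the nginx error log in the question.
REQUEST_BYTES = 80762097

for limit in ("1m", "100M"):
    allowed = parse_nginx_size(limit)
    # nginx rejects the request when the body is larger than the limit.
    verdict = "accepted" if REQUEST_BYTES <= allowed else "rejected"
    print(f"client_max_body_size {limit}: {allowed} bytes -> upload {verdict}")
```

Under these assumptions the roughly 80 MB upload exceeds the 1m default and fits within 100M, which matches the log message and the comment above.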
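【Solution 2】:

From https://flask.palletsprojects.com/en/1.1.x/patterns/fileuploads/

Connection Reset Issue

When using the local development server, you may get a connection reset error instead of a 413 response. You will get the correct status response when running the app with a production WSGI server.

Solution: don't use the development server; set up a real WSGI server.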
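
For context, a 413 response is what a server returns when a request body exceeds a configured limit (in Flask that limit is the MAX_CONTENT_LENGTH setting). The following is a minimal stdlib-only sketch of such a check written as WSGI middleware. It is not Flask's actual implementation, and the names limit_body_size, hello_app and call are hypothetical.

```python
from wsgiref.util import setup_testing_defaults

MAX_BODY_BYTES = 100 * 1024 * 1024  # assumed limit: 100 MB


def limit_body_size(app, max_bytes=MAX_BODY_BYTES):
    """Wrap a WSGI app and answer 413 when Content-Length exceeds max_bytes."""
    def middleware(environ, start_response):
        try:
            length = int(environ.get("CONTENT_LENGTH") or 0)
        except ValueError:
            length = 0  # malformed header: let the wrapped app decide
        if length > max_bytes:
            body = b"Request Entity Too Large"
            start_response("413 Request Entity Too Large",
                           [("Content-Type", "text/plain"),
                            ("Content-Length", str(len(body)))])
            return [body]
        return app(environ, start_response)
    return middleware


def hello_app(environ, start_response):
    """Stand-in for the real upload handler."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


def call(app, content_length):
    """Invoke a WSGI app with a fake POST and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["REQUEST_METHOD"] = "POST"
    environ["CONTENT_LENGTH"] = str(content_length)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


small_limit_app = limit_body_size(hello_app, max_bytes=10)
print(call(small_limit_app, 10)[0])
print(call(small_limit_app, 11)[0])
```

A check like this only helps if the request reaches the application; in the question's setup nginx rejected the body first.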

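【Comments】:

  • Link to the deployment options and I'll happily upvote
  • flask.palletsprojects.com/en/1.1.x/deploying but I feel there is something else going on with this question
  • @roganjosh It turns out not to be relevant, since the OP is running behind nginx, and that is where the connection reset happens
【Solution 3】:

Have you looked into chunking? It lets you break files and data into discrete pieces to send over the network. There are two main parts: the frontend and the backend. For the frontend you can use something like Dropzone.js, but you need to enable its chunking behavior, since it is off by default. Fortunately, it is really easy to enable.

This could be done like so: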

<html>
    <meta charset="UTF-8">
    <link rel="stylesheet"
     href="https://cdnjs.cloudflare.com/ajax/libs/dropzone/5.4.0/min/dropzone.min.css"/>

    <link rel="stylesheet"
     href="https://cdnjs.cloudflare.com/ajax/libs/dropzone/5.4.0/min/basic.min.css"/>

    <script type="application/javascript"
     src="https://cdnjs.cloudflare.com/ajax/libs/dropzone/5.4.0/min/dropzone.min.js">
    </script>
    <head>
        <title>Simple file upload using Python Flask</title>
    </head>
        <form method="POST" action='/upload' class="dropzone dz-clickable"
                    id="dropper" enctype="multipart/form-data">
        </form>

        <script type="application/javascript">
                Dropzone.options.dropper = {
                        paramName: 'file',
                        chunking: true,
                        forceChunking: true,
                        url: '/upload',
                        maxFilesize: 1025, // megabytes
                        chunkSize: 1000000 // bytes
                }
        </script>
</html>

The Flask example below handles the backend:

import logging
import os

from flask import render_template, Blueprint, request, make_response
from werkzeug.utils import secure_filename

from pydrop.config import config

blueprint = Blueprint('templated', __name__, template_folder='templates')

log = logging.getLogger('pydrop')


@blueprint.route('/')
@blueprint.route('/index')
def index():
    # Route to serve the upload form
    return render_template('index.html',
                           page_name='Main',
                           project_name="pydrop")


@blueprint.route('/upload', methods=['POST'])
def upload():
    file = request.files['file']

    save_path = os.path.join(config.data_dir, secure_filename(file.filename))
    current_chunk = int(request.form['dzchunkindex'])

    # If the file already exists it's ok if we are appending to it,
    # but not if it's new file that would overwrite the existing one
    if os.path.exists(save_path) and current_chunk == 0:
        # 400 and 500s will tell dropzone that an error occurred and show an error
        return make_response(('File already exists', 400))

    try:
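        # Open without append mode: with 'ab', some systems send every write
        # to the end of the file no matter where seek() points
        mode = 'r+b' if os.path.exists(save_path) else 'wb'
        with open(save_path, mode) as f:
            f.seek(int(request.form['dzchunkbyteoffset']))
            f.write(file.stream.read())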
    except OSError:
        # log.exception will include the traceback so we can see what's wrong 
        log.exception('Could not write to file')
        return make_response(("Not sure why,"
                              " but we couldn't write the file to disk", 500))

    total_chunks = int(request.form['dztotalchunkcount'])

    if current_chunk + 1 == total_chunks:
        # This was the last chunk, the file should be complete and the size we expect
        if os.path.getsize(save_path) != int(request.form['dztotalfilesize']):
            log.error(f"File {file.filename} was completed, "
                      f"but has a size mismatch. "
                      f"Was {os.path.getsize(save_path)} but we"
                      f" expected {request.form['dztotalfilesize']}")
            return make_response(('Size mismatch', 500))
        else:
            log.info(f'File {file.filename} has been uploaded successfully')
    else:
        log.debug(f'Chunk {current_chunk + 1} of {total_chunks} '
                  f'for file {file.filename} complete')

    return make_response(("Chunk upload successful", 200))
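
To see why each chunk carries a byte offset, here is a minimal stdlib-only sketch of the split-and-reassemble idea, independent of Flask and Dropzone. The helper names iter_chunks, write_chunk, reassemble and demo are hypothetical, and the chunk size of 1000 bytes is chosen only to keep the demo small.

```python
import os
import tempfile


def iter_chunks(data: bytes, chunk_size: int):
    """Yield (index, byte_offset, chunk) tuples that cover data in order."""
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        yield index, offset, data[offset:offset + chunk_size]


def write_chunk(path: str, offset: int, chunk: bytes) -> None:
    """Write one chunk at its byte offset, creating the file if needed."""
    # 'r+b' updates in place; append mode may ignore the seek position.
    mode = "r+b" if os.path.exists(path) else "wb"
    with open(path, mode) as f:
        f.seek(offset)
        f.write(chunk)


def reassemble(chunks, path: str) -> int:
    """Write every chunk to path and return the resulting file size."""
    for _index, offset, chunk in chunks:
        write_chunk(path, offset, chunk)
    return os.path.getsize(path)


def demo():
    payload = bytes(range(256)) * 10           # 2560 bytes of sample data
    chunks = list(iter_chunks(payload, 1000))  # chunks of 1000, 1000, 560 bytes
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "upload.bin")
        # Deliver the chunks in reverse to show the offsets still place them.
        size = reassemble(reversed(chunks), path)
        with open(path, "rb") as f:
            return size, f.read() == payload, len(chunks)


print(demo())
```

Because every chunk is written at its own offset, the file comes out identical even when chunks arrive out of order, and the final size check mirrors the dztotalfilesize comparison in the handler above.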

【Comments】:

  • +1 for proposing a more sensible solution than the "raise the limit and increase the timeout" advice suggested everywhere