How do I batch send a multipart html post with multiple urls?

[Posted]: 2015-02-23 09:35:36
[Question]:

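I am talking to the Gmail API and would like to batch the requests. They have a friendly guide for this here, https://developers.google.com/gmail/api/guides/batch, which suggests that I should be able to use multipart/mixed and include different URLs.

I am using Python and the Requests library, but am unsure how to issue different URLs. Answers like "How to send a "multipart/form-data" with requests in python?" don't mention an option for changing that part.

How do I do this?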

[Tags]: python python-requests multipartform-data gmail-api

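[Answer 1]:

Unfortunately, requests does not support multipart/mixed in its API. This has been suggested in several GitHub issues (#935 and #1081), but there are no updates on this for now. It also becomes quite clear when you search for "mixed" in the requests sources and get zero results :(

Now you have several options, depending on how much you want to make use of Python and 3rd-party libraries.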

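Google API Client

The most obvious answer to this problem is to use the official Python API client that Google provides. It comes with a BatchHttpRequest class that can handle the batch requests you need. This is documented in detail in Google's batch guide for that client.

Basically, you create a BatchHttpRequest object and add all your requests to it. The library will then put everything together (taken from the guide above):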

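from googleapiclient.http import BatchHttpRequest

# service, http and the two callbacks are assumed to be set up as in the guide
batch = BatchHttpRequest()
batch.add(service.animals().list(), callback=list_animals)
batch.add(service.farmers().list(), callback=list_farmers)
batch.execute(http=http)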
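
The snippet above assumes that list_animals and list_farmers already exist. As a rough sketch: a batch callback in the Google client is called with the request id, the parsed response, and an exception (or None). The `items` key, the `results` dictionary, and the `make_callback` helper below are made up for illustration and are not part of the Google API.

```python
# Hypothetical callbacks for the batch example; the 'items' key and the
# module-level results dict are assumptions, not part of the Google API.
results = {}

def make_callback(name):
    """Build a callback that stores a response (or the error) under name."""
    def callback(request_id, response, exception):
        if exception is not None:
            # an error for a single sub-request is handed to the callback
            results[name] = {'error': str(exception)}
        else:
            results[name] = response.get('items', [])
    return callback

list_animals = make_callback('animals')
list_farmers = make_callback('farmers')
```

Collecting results in a dictionary keeps the callbacks independent of the order in which the batch responses come back.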

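Now, if for some reason you cannot or will not use the official Google library, you will have to build parts of the request body yourself.

requests + email.mime

As I already mentioned, requests does not officially support multipart/mixed. But that does not mean we cannot "force" it. When creating a Request object, we can use the files parameter to provide multipart data.

files is a dictionary whose values are 4-tuples of this format: (filename, file_object, content_type, headers). The filename can be empty. Now we need to convert a Request object into a file(-like) object. I wrote a small method that covers the basic cases from the Google example. It is partly inspired by the internal methods that Google uses in its Python library: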

import requests
from email.mime.multipart import MIMEMultipart
from email.mime.nonmultipart import MIMENonMultipart

BASE_URL = 'http://www.googleapis.com/batch'

def serialize_request(request):
    '''Returns the string representation of the request'''
    prepared = request.prepare()

    # write first line (method + uri), relative to the batch endpoint if possible
    if request.url.startswith(BASE_URL):
        mime_body = '%s %s\r\n' % (request.method, request.url[len(BASE_URL):])
    else:
        mime_body = '%s %s\r\n' % (request.method, request.url)

    # write headers (if possible)
    for key, value in prepared.headers.items():
        mime_body += '%s: %s\r\n' % (key, value)

    # write body; requests may hand it back as bytes (e.g. for json=...)
    body = getattr(prepared, 'body', None)
    if body is not None:
        if isinstance(body, bytes):
            body = body.decode('utf-8')
        mime_body += '\r\n' + body + '\r\n'

    return mime_body.lstrip()

This method transforms a requests.Request object into a string that can later be used as the payload of a MIMENonMultipart object, i.e. one of the individual multipart parts.
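
To make the target format concrete, here is a dependency-free sketch of the same kind of serialization. `serialize_http` is a hypothetical helper written for this illustration; it is not part of requests and only mirrors the layout described above (request line, headers, blank line, body).

```python
def serialize_http(method, path, headers=None, body=None):
    """Render one HTTP request as text: request line, headers, blank line, body."""
    lines = ['%s %s' % (method, path)]
    for key, value in (headers or {}).items():
        lines.append('%s: %s' % (key, value))
    text = '\r\n'.join(lines) + '\r\n'
    if body is not None:
        # a blank line separates the headers from the body
        text += '\r\n' + body + '\r\n'
    return text

example = serialize_http(
    'PUT', '/farm/v1/animals/sheep',
    headers={'Content-Type': 'application/json', 'If-Match': '"etag/sheep"'},
    body='{"animalName": "sheep"}',
)
print(example)
```

A request without headers or body collapses to just the request line, which is what the plain GET parts of a batch look like.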

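Now, to generate the actual batch request, we first need to squeeze a list of (Google API) requests into a files dictionary for the requests library. The following method takes a list of requests.Request objects, transforms each one into a MIMENonMultipart, and then returns a dictionary that complies with the structure of the files dictionary: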

import uuid

def prepare_requests(request_list):
    output = {}

    for request in request_list:
        message_id = new_id()
        sub_message = MIMENonMultipart('application', 'http')
        sub_message['Content-ID'] = message_id
        # drop the MIME header we don't need inside a batch part
        del sub_message['MIME-Version']

        payload = serialize_request(request)
        if isinstance(payload, bytes):
            # the email generator expects a text payload here
            payload = payload.decode('utf-8')
        sub_message.set_payload(payload)

        # serialize without the "From ..." envelope line; the leading newline
        # is what the original first-line stripping left in place
        part = '\n' + sub_message.as_string(unixfrom=False)

        output[message_id] = ('', part, 'application/http', {})

    return output

def new_id():
    # I am not sure how these work exactly, so you will have to adapt this code
    return '<item%s:12930812@barnyard.example.com>' % str(uuid.uuid4())[-4:]
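
To see what a single part looks like before it goes into the files dictionary, the following sketch builds one application/http part using only the standard library. `build_part` is a hypothetical helper for illustration, and the payload is a hand-written request line rather than the output of serialize_request.

```python
from email.mime.nonmultipart import MIMENonMultipart

def build_part(content_id, payload):
    """Wrap an already serialized HTTP request in an application/http part."""
    part = MIMENonMultipart('application', 'http')
    part['Content-ID'] = content_id
    del part['MIME-Version']  # not needed inside a batch part
    part.set_payload(payload)
    # unixfrom=False keeps the "From ..." envelope line out of the output
    return part.as_string(unixfrom=False)

text = build_part('<item1:12930812@barnyard.example.com>',
                  'GET /farm/v1/animals/pony\r\n')
print(text)
```

Printing the part shows the two MIME headers, a blank line, and then the embedded HTTP request.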

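Finally, we need to change the Content-Type from multipart/form-data to multipart/mixed and also remove the Content-Disposition and Content-Type headers from each request part. These are generated by requests and cannot be overridden through the files dictionary.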

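import re

def finalize_request(prepared):
    # change to multipart/mixed
    old = prepared.headers['Content-Type']
    prepared.headers['Content-Type'] = old.replace('multipart/form-data', 'multipart/mixed')

    # remove headers at the start of each boundary
    # (the multipart body built by requests is bytes, hence the bytes pattern)
    prepared.body = re.sub(br'\r\nContent-Disposition: form-data; name=.+\r\nContent-Type: application/http\r\n', b'', prepared.body)

    # the body got shorter, so the Content-Length header has to be recomputed
    prepared.headers['Content-Length'] = str(len(prepared.body))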
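
The effect of the header-stripping pattern is easier to see on a small sample. The body below is a hand-written stand-in for what a multipart encoder might emit; the boundary `abc` and the part name are made up, and a real body produced by requests may differ in detail.

```python
import re

# Hand-written stand-in for a multipart body; boundary and names are made up.
body = (
    b'--abc\r\n'
    b'Content-Disposition: form-data; name="<item1>"; filename=""\r\n'
    b'Content-Type: application/http\r\n'
    b'\r\n'
    b'GET /farm/v1/animals/pony\r\n'
    b'--abc--\r\n'
)

# Same pattern as in finalize_request, split over two lines for readability.
PART_HEADERS = re.compile(
    br'\r\nContent-Disposition: form-data; name=.+\r\n'
    br'Content-Type: application/http\r\n'
)

cleaned = PART_HEADERS.sub(b'', body)
print(cleaned.decode('utf-8'))
```

Because `.` never matches a newline, the `name=.+` portion cannot run past the end of the Content-Disposition line, so only the two generated header lines are removed.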

I did my best to test this with the Google example from the batching guide:

sheep = {
  "animalName": "sheep",
  "animalAge": "5",
  "peltColor": "green"
}

commands = []
commands.append(requests.Request('GET', 'http://www.googleapis.com/batch/farm/v1/animals/pony'))
commands.append(requests.Request('PUT', 'http://www.googleapis.com/batch/farm/v1/animals/sheep', json=sheep, headers={'If-Match': '"etag/sheep"'}))
commands.append(requests.Request('GET', 'http://www.googleapis.com/batch/farm/v1/animals', headers={'If-None-Match': '"etag/animals"'}))

files = prepare_requests(commands)

r = requests.Request('POST', 'http://www.googleapis.com/batch', files=files)
prepared = r.prepare()

finalize_request(prepared)

s = requests.Session()
s.send(prepared)
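
Before actually sending anything, it can help to look at what would go over the wire. The sketch below is one possible way to do that; `dump_prepared` is a hypothetical helper, and the stand-in object only mimics the method, url, headers and body attributes of a prepared request so that the example runs without requests or network access.

```python
from types import SimpleNamespace

def dump_prepared(prepared):
    """Return a readable text dump of a prepared request."""
    lines = ['%s %s' % (prepared.method, prepared.url)]
    for key, value in prepared.headers.items():
        lines.append('%s: %s' % (key, value))
    body = prepared.body or b''
    if isinstance(body, bytes):
        body = body.decode('utf-8', errors='replace')
    return '\n'.join(lines) + '\n\n' + body

# Stand-in object with the same attribute names as a prepared request.
fake = SimpleNamespace(
    method='POST',
    url='http://www.googleapis.com/batch',
    headers={'Content-Type': 'multipart/mixed; boundary=abc'},
    body=b'--abc--\r\n',
)
print(dump_prepared(fake))
```

Calling dump_prepared(prepared) after finalize_request(prepared) and before s.send(prepared) should give a listing in the same spirit as the sample request in this answer.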

The resulting request should be close enough to what Google provides in its batching guide:

POST http://www.googleapis.com/batch
Content-Length: 1006
Content-Type: multipart/mixed; boundary=a21beebd15b74be89539b137bbbc7293

--a21beebd15b74be89539b137bbbc7293

Content-Type: application/http
Content-ID: <item8065:12930812@barnyard.example.com>

GET /farm/v1/animals
If-None-Match: "etag/animals"

--a21beebd15b74be89539b137bbbc7293

Content-Type: application/http
Content-ID: <item5158:12930812@barnyard.example.com>

GET /farm/v1/animals/pony

--a21beebd15b74be89539b137bbbc7293

Content-Type: application/http
Content-ID: <item0ec9:12930812@barnyard.example.com>

PUT /farm/v1/animals/sheep
Content-Length: 63
Content-Type: application/json
If-Match: "etag/sheep"

{"animalAge": "5", "animalName": "sheep", "peltColor": "green"}

--a21beebd15b74be89539b137bbbc7293--

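In the end, I highly recommend the official Google library, but if you cannot use it, you will have to improvise a bit :)

Disclaimer: I have not actually tried to send this request to the Google API endpoints, because the authentication procedure is too much of a hassle. I was just trying to get as close as possible to the HTTP request described in the batching guide. There may be some problems with \r and \n line endings, depending on how strict the Google endpoints are.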
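
On the line-ending concern: one way to sidestep it might be to let the email package write the whole multipart/mixed body with CRLF line endings, instead of patching the body that requests generates. This is a minimal sketch assuming Python 3; `build_batch_body` and the boundary `batch_demo` are made up for illustration, and like the rest of this answer it has not been tested against Google's endpoints.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.nonmultipart import MIMENonMultipart

def build_batch_body(parts, boundary):
    """Serialize (content_id, http_text) pairs as a CRLF multipart/mixed body."""
    message = MIMEMultipart('mixed', boundary=boundary)
    for content_id, http_text in parts:
        sub = MIMENonMultipart('application', 'http')
        sub['Content-ID'] = content_id
        del sub['MIME-Version']
        sub.set_payload(http_text)
        message.attach(sub)

    # Serialize with CRLF line endings instead of the default "\n".
    crlf = message.policy.clone(linesep='\r\n')
    raw = message.as_bytes(policy=crlf)

    # Drop the top-level MIME headers; they belong in the HTTP request headers.
    return raw.split(b'\r\n\r\n', 1)[1]

body = build_batch_body(
    [('<item1:12930812@barnyard.example.com>', 'GET /farm/v1/animals/pony\r\n'),
     ('<item2:12930812@barnyard.example.com>', 'GET /farm/v1/animals\r\n')],
    boundary='batch_demo',
)
print(body.decode('utf-8'))
```

The resulting bytes could then be posted with data=body and a Content-Type header of multipart/mixed; boundary=batch_demo, which avoids the files dictionary and the regex clean-up altogether.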

[Comments]:

What an overview, perfect!