【Question title】: Is there an alternative to parse_qs that handles semi-colons?
【Posted】: 2014-01-21 12:20:00
【Question description】:

TL;DR

What libraries/calls are available to handle query strings containing semi-colons differently than parse_qs?

>>> import urlparse
>>> urlparse.parse_qs("tagged=python;ruby")
{'tagged': ['python']}

Full background

I'm working with the StackExchange API to search for tagged questions.

Search is laid out like this, with tags separated by semi-colons:

/2.1/search?order=desc&sort=activity&tagged=python;ruby&site=stackoverflow

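Interacting with the API is fine. The problem comes in when I want to test the calls, particularly when using httpretty to mock HTTP.

Under the hood, httpretty uses urlparse.parse_qs from the Python standard library to parse the query string.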

>>> urlparse.parse_qs("tagged=python;ruby")
{'tagged': ['python']}

Clearly that doesn't work well. That's the small example; here's a snippet of httpretty (outside of a testing context).

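import requests
import httpretty

# Start intercepting HTTP traffic
httpretty.enable()

# Mock the search endpoint with an empty result set
httpretty.register_uri(httpretty.GET, "https://api.stackexchange.com/2.1/search", body='{"items":[]}')
# The tags are sent as one semicolon-separated value
resp = requests.get("https://api.stackexchange.com/2.1/search", params={"tagged":"python;ruby"})
# Inspect how httpretty parsed the query string of the request it captured
httpretty_request = httpretty.last_request()
print(httpretty_request.querystring)

# Stop intercepting and clear the registered URIs
httpretty.disable()
httpretty.reset()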

I'd like to use httpretty's machinery, but I need a workaround for parse_qs. I can monkey patch httpretty for now, but I'd love to see what else can be done.

【Comments】:

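Unfortunately ';' is hard-coded as a separator in urlparse, and there is no parameter to override it. See: hg.python.org/cpython/file/2.7/Lib/urlparse.py#l150-157
Oh, hey, thanks for linking to the source. It looks like it's actually hard-coded in parse_qsl.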
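
Since the separator cannot be overridden in that version of urlparse, one alternative is a small hand-rolled parser that splits on '&' only. The sketch below is an illustration, not part of httpretty or the standard library: the name parse_qs_amp_only is hypothetical, it is written for Python 3 (urllib.parse), and unlike parse_qs it keeps blank values. As far as I know, newer Python 3 releases (3.9.2 and later, plus some security backports) already split on '&' only by default and accept a separator argument to urllib.parse.parse_qs, so this is mainly useful on older interpreters.

```python
from urllib.parse import unquote_plus  # on Python 2: from urllib import unquote_plus


def parse_qs_amp_only(query):
    """Parse a query string, treating only '&' as the pair separator.

    Hypothetical helper: semicolons stay inside the value they belong to.
    """
    result = {}
    for pair in query.split("&"):
        if not pair:
            continue  # skip empty segments, e.g. from a trailing '&'
        name, _, value = pair.partition("=")
        # Decode percent-escapes and '+' in both the name and the value
        result.setdefault(unquote_plus(name), []).append(unquote_plus(value))
    return result


print(parse_qs_amp_only("order=desc&tagged=python;ruby&site=stackoverflow"))
```

The semicolon-separated tags come back as a single value, 'python;ruby', which can then be split on ';' by the caller if a list is needed.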

Tags: python http mocking stackexchange httpretty

【Solution 1】:

To get around this, I temporarily patched httpretty.core.unquote_utf8 (technically httpretty.compat.unquote_utf8).

import httpretty

# To get around how parse_qs works (urlparse, under the hood of
# httpretty), we'll leave the semicolon quoted.
#
# See https://github.com/gabrielfalcao/HTTPretty/issues/134
orig_unquote = httpretty.core.unquote_utf8
httpretty.core.unquote_utf8 = (lambda x: x)

# It should handle tags as a list.
# (param_check_callback, search_questions and since belong to the
# surrounding test suite and are not shown here.)
httpretty.register_uri(httpretty.GET,
                       "https://api.stackexchange.com/2.1/search",
                       body=param_check_callback({'tagged': 'python;dog'}))
search_questions(since=since, tags=["python", "dog"], site="pets")

...

# Back to normal for the rest
httpretty.core.unquote_utf8 = orig_unquote
# Test the test by making sure this is back to normal
assert httpretty.core.unquote_utf8("%3B") == ";"
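
The save, patch and restore steps above can also be wrapped in a context manager, so the original function is put back even if the test body raises. This is only a sketch: swapped_attr is a hypothetical helper, and fake_core is a stand-in object used so the example runs without httpretty installed. In a real test the target would presumably be httpretty.core with the attribute name "unquote_utf8", as in the patch above.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def swapped_attr(obj, name, replacement):
    """Temporarily replace obj.<name>, restoring the original on exit."""
    original = getattr(obj, name)
    setattr(obj, name, replacement)
    try:
        yield original
    finally:
        # Runs on normal exit and on exceptions alike
        setattr(obj, name, original)


# Stand-in for httpretty.core (assumption: only used to make the sketch runnable)
fake_core = SimpleNamespace(unquote_utf8=lambda s: s.replace("%3B", ";"))

with swapped_attr(fake_core, "unquote_utf8", lambda s: s):
    inside = fake_core.unquote_utf8("%3B")  # identity patch is active here

after = fake_core.unquote_utf8("%3B")  # original function is back
print(inside, after)
```

The standard library's unittest.mock.patch.object does the same job and may be preferable where mock is already in use.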

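The patch above assumes you don't need anything else unquoted. Another option is to leave only the semicolons percent-encoded before the query string reaches parse_qs.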
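
A rough sketch of that second option, under some assumptions: unquote_keep_semicolons is a hypothetical name, the code targets Python 3 (urllib.parse), and it is meant as a drop-in for the single-argument unquote_utf8 patched above rather than anything httpretty provides. It decodes every percent-escape except %3B. parse_qs percent-decodes values itself, so the semicolon still shows up in the parsed value without ever acting as a separator.

```python
import re
from urllib.parse import parse_qs, unquote

# Matches an encoded semicolon in either case (%3B or %3b)
_ENCODED_SEMICOLON = re.compile(r"%3B", re.IGNORECASE)


def unquote_keep_semicolons(text):
    """Percent-decode text but leave encoded semicolons as '%3B'."""
    # Split around the encoded semicolons, decode the pieces in between,
    # then stitch the pieces back together with the escape intact.
    parts = _ENCODED_SEMICOLON.split(text)
    return "%3B".join(unquote(part) for part in parts)


raw = "tagged=python%3Bruby&title=a%20b&site=stackoverflow"
partially_decoded = unquote_keep_semicolons(raw)
print(partially_decoded)
print(parse_qs(partially_decoded))
```

One known limitation of this sketch: a doubly encoded semicolon (%253B) decodes to the literal text %3B and then becomes indistinguishable from a preserved semicolon.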

【Discussion】: