Closing a "Python Requests" connection from another thread: answers

【Question title】: Closing "Python Requests" connection from another thread
【Posted】: 2013-05-05 23:37:04
【Question】:

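To shut down my application as quickly as possible, can I interrupt a requests.post call from another thread and have it terminate the connection immediately?

I played with the adapters, but no luck so far: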

for ad in self.client.session.adapters.values():
    ad.close()

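【Comments】:

Tags: python python-requests

【Answer 1】:

The right way to do this is to use message passing to the other thread. We can do a poor man's version of it with a shared global variable. For example, you can try running this script: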

#!/usr/bin/env python3
# A test script to verify that you can abort streaming downloads of large
# files.
import threading
import time
import requests

# Shared flag: the main thread sets it, the download thread polls it.
stop_download = False

def download(url):
    r = requests.get(url, stream=True)
    data = ''
    content_gen = r.iter_content()

    while not stop_download:
        try:
            # Note: this builds a new iterator on each pass; no body bytes
            # are read until an iterator is advanced with next().
            data = r.iter_content(1024)
        except StopIteration:
            break

    if stop_download:
        print('Killed from other thread!')
        r.close()

if __name__ == '__main__':
    # Keep the Thread object; Thread.start() itself returns None.
    t = threading.Thread(target=download,
                         args=('http://ftp.freebsd.org/pub/FreeBSD/ISO-IMAGES-amd64/9.1/FreeBSD-9.1-RELEASE-amd64-dvd1.iso',)
                        )
    t.start()
    time.sleep(5)
    stop_download = True
    time.sleep(5) # Just to make sure you believe that the message actually stopped the other thread.

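When doing this in production, especially if you don't have the protection of the GIL, you will want to be more careful with the message-passing state to avoid awkward multithreading bugs. I'm leaving that up to the implementor.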
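One way to make the stop signal a little more robust than a bare global is a threading.Event. The sketch below is only an illustration under stated assumptions: read_until_stopped is a hypothetical helper name, and chunks stands for any iterator of byte chunks, such as the one returned by r.iter_content(1024) on a streamed response.

```python
import threading

def read_until_stopped(chunks, stop_event):
    """Consume `chunks` until it is exhausted or `stop_event` is set.

    `chunks` is any iterable of bytes objects (hypothetical parameter; in
    the script above it would be r.iter_content(1024)).
    Returns (bytes_read, stopped_early).
    """
    iterator = iter(chunks)
    bytes_read = 0
    # The flag is checked before every read, so a stop request takes
    # effect after at most one more chunk.
    while not stop_event.is_set():
        try:
            chunk = next(iterator)
        except StopIteration:
            return bytes_read, False  # body finished normally
        bytes_read += len(chunk)
    return bytes_read, True  # another thread asked us to stop
```

The download thread would call read_until_stopped(r.iter_content(1024), stop_event) and then r.close() if it was stopped early, while the main thread calls stop_event.set() instead of assigning to a global. A smaller chunk size shortens the delay between setting the event and the loop noticing it.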

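【Comments】:

Also, you could use a very small content size to get finer control over when to cancel. And your content_gen = r.iter_content() line is unnecessary ;)
iter_content() returns an iterator (or generator), so you should really do something like content_gen = r.iter_content(1024) followed by multiple next(content_gen) calls.
@Lukasa, could you tell me whether, when the line r = requests.get(url, stream=True) is executed, the response data is all downloaded into memory somewhere? I ask because I'm not sure where the 1024 bytes of data in data = r.iter_content(1024) come from. If that is the case, terminating the request later would be pointless, since the data has already been downloaded (the request/response is over).
No. stream=True means that Requests stops pulling data from the OS as soon as it has all of the headers. The rest of the data sits in the OS socket buffer and, if you don't read from it, will eventually be slowed and stopped by TCP. None of the remaining data is in memory until you iterate over some of it with iter_content.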

【Answer 2】:

I found a way. Here is how to interrupt the connection:

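import logging
import threading
import time
import requests

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger(__name__)

def close():
    # Runs in a second thread: wait, then close the file object under the
    # in-flight response. _fp is a private attribute and may change.
    time.sleep(5)
    r.raw._fp.close()

t = threading.Thread(target=close)
t.start()
print("getting")
s = requests.Session()
r = s.get("http://download.thinkbroadband.com/1GB.zip", stream=True)
for line in r.iter_content(1024):
    log.debug("got it: %s", len(line))
print("done")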

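This is a hack, though, and I don't like it: private members can change in the future. I went back to urllib2.

【Comments】:

These days there is also Response.close, which seems to do the same thing.
This approach is useful when your thread is blocked reading from the stream and you want to cancel the read. There seems to be a timeout of a few seconds after calling close before the reader aborts.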
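The idea in these comments, calling the public Response.close from a second thread while the first is still reading, could be sketched as follows. This is a sketch under assumptions rather than a confirmed recipe: read_with_deadline is a hypothetical helper, response is expected to be a streamed requests.Response (anything with iter_content and close will do), and the thread does not say which exception the reader sees after the close, so the code catches broadly.

```python
import threading

def read_with_deadline(response, deadline_seconds, chunk_size=1024):
    """Read a streamed response, closing it from a timer thread if the
    read has not finished after `deadline_seconds`.

    Returns (bytes_received, error), where error is None when the body
    was read to the end without an exception.
    """
    # The timer thread calls the public close() method of the response.
    closer = threading.Timer(deadline_seconds, response.close)
    closer.daemon = True
    closer.start()
    received = 0
    error = None
    try:
        for chunk in response.iter_content(chunk_size):
            received += len(chunk)
    except Exception as exc:  # assumption: exact type varies by version
        error = exc
    finally:
        closer.cancel()  # no-op if the timer has already fired
    return received, error
```

As the comment above notes, the reader may take a few seconds to notice the close, so the deadline is a lower bound on how long the call can take, not an upper bound.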

【Answer 3】:

So, if you do the following from an interactive shell, you'll see that closing the adapters doesn't seem to do what you want.

>>> import requests
>>> s = requests.session()
>>> s.close()
>>> s.get('http://httpbin.org/get')
<Response [200]>
>>> for _, adapter in s.adapters.items():
...     adapter.close()
...
>>> s.get('http://httpbin.org/get')
<Response [200]>
>>> s.get('https://httpbin.org/get')
<Response [200]>

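This looks like it might be a bug in Requests. In general, closing the adapter should prevent you from making further requests, but I'm not entirely sure it would interrupt a request that is currently running.

Looking at HTTPAdapter (which backs the standard 'http://' and 'https://' adapters), calling close on it calls clear on the underlying urllib3 PoolManager. From urllib3's documentation for that method you can see: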

This will not affect in-flight connections, but they will not be
re-used after completion.

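So, in essence, what you're seeing is that you cannot affect connections that have not yet completed.

【Comments】:

Yes, unfortunately Requests isn't designed for this; I guess small HTTP requests are its main purpose. I'm using it in a desktop application, where the client has to be interruptible.
You may be able to dig down into urllib3 and forcibly close the socket, but urllib3 sits on top of httplib and I don't know whether that's possible. If you want that kind of fine-grained control, you could just build the whole request (and parse the response) yourself using raw sockets. When the user wants to cancel something, you just close the socket. That's a lot of work, though.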
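A rough sketch of that raw-socket idea, using only the standard library, might look like the following. Everything here is an assumption for illustration: CancellableGet is a hypothetical class, it speaks plain HTTP/1.0 with no TLS, redirects, or response parsing, and cancel() calls shutdown() before the socket is closed because closing alone may not wake a thread that is already blocked in recv() on some platforms.

```python
import socket
import threading

class CancellableGet:
    """Hypothetical helper: a plain-HTTP GET over a raw socket that
    another thread can cancel while the body is still being read."""

    def __init__(self, host, port=80, path="/"):
        self.cancelled = threading.Event()
        self._sock = socket.create_connection((host, port))
        # HTTP/1.0 keeps the sketch simple: the server closes the
        # connection when the body ends.
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        self._sock.sendall(request.encode("ascii"))

    def read_all(self, chunk_size=4096):
        """Return the raw response bytes (headers included) received
        before the server finished or cancel() was called."""
        chunks = []
        try:
            while True:
                chunk = self._sock.recv(chunk_size)
                if not chunk:  # peer closed, or we were shut down
                    break
                chunks.append(chunk)
        except OSError:
            pass  # some platforms raise instead of returning b""
        finally:
            self._sock.close()
        return b"".join(chunks)

    def cancel(self):
        """Call from another thread to unblock a pending read_all()."""
        self.cancelled.set()
        try:
            self._sock.shutdown(socket.SHUT_RDWR)
        except OSError:
            pass  # socket already closed or never fully connected
```

A worker thread would call read_all(), and the UI thread would call cancel() when the user aborts; checking cancelled afterwards tells the worker whether the data it got back is complete. Parsing the status line, headers, and chunked bodies is left out, which is most of the "lot of work" mentioned above.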