【Question title】: Python, Catch timeout during stream request
【Posted】: 2014-01-20 13:31:31
【Question description】:

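I am using the requests library to read XML events, as shown in the code below. How can I raise a connection-lost error once the request has started? The server emulates HTTP push / long polling (http://en.wikipedia.org/wiki/Push_technology#Long_polling) and by default never ends the response. If no new message arrives within 10 minutes, the while loop should exit.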

import requests
from time import time


if __name__ == '__main__':
    #: Set a default content-length
    content_length = 512
    try:
        requests_stream = requests.get('http://agent.mtconnect.org:80/sample?interval=0', stream=True, timeout=2)
        while True:
            start_time = time()
            #: Read three lines to determine the content-length         
            for line in requests_stream.iter_lines(3, decode_unicode=None):
                if line.startswith('Content-length'):
                    content_length = int(''.join(x for x in line if x.isdigit()))
                    #: pause the generator
                    break

            #: Continue the generator and read the exact amount of the body.        
            for xml in requests_stream.iter_content(content_length):
                print "Received XML document with content length of %s in %s seconds" % (len(xml), time() - start_time)
                break

    except requests.exceptions.RequestException as e:
        print('error: ', e)

The server push can be tested from the command line with curl:

curl http://agent.mtconnect.org:80/sample\?interval\=0

【Discussion】:

    Tags: python python-requests urllib3


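    【Solution 1】:

    This may not be the best approach, but you can use multiprocessing to run the request in a separate process. Something like this should work: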

    import multiprocessing
    import requests
    import time
    
    class RequestClient(multiprocessing.Process):
        def run(self):
            # Write all your code to process the requests here
            content_length = 512
            try:
                requests_stream = requests.get('http://agent.mtconnect.org:80/sample?interval=0', stream=True, timeout=2)
    
                start_time = time.time()
                for line in requests_stream.iter_lines(3, decode_unicode=None):
                    if line.startswith('Content-length'):
                        content_length = int(''.join(x for x in line if x.isdigit()))
                        break
    
                for xml in requests_stream.iter_content(content_length):
                    print "Received XML document with content length of %s in %s seconds" % (len(xml), time.time() - start_time) 
                    break
            except requests.exceptions.RequestException as e:
                print('error: ', e)
    
    
    while True:
        childProcess = RequestClient()
        childProcess.start()
    
        # Wait for 10mins
        start_time = time.time()
        while time.time() - start_time <= 600:
            # Check if the process is still active
            if not childProcess.is_alive():
                # Request completed
                break
            time.sleep(5)    # Give the system some breathing time
    
        # Check if the process is still active after 10mins.
        if childProcess.is_alive():
            # Shutdown the process
            childProcess.terminate()
            raise RuntimeError("Connection Timed-out")
    

    It is not perfect code for your problem, but you get the idea.
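For comparison, one option that stays inside a single process is the read timeout built into requests. The sketch below is written for Python 3 and assumes a requests version that accepts a `(connect, read)` timeout tuple (2.4 or later, as far as I know). The read timeout limits the gap between two pieces of data from the server, so a stream that goes silent for longer than that raises an exception derived from `requests.exceptions.RequestException`. The function name `read_stream` and its parameters are made up for illustration.

```python
import requests


def read_stream(url, connect_timeout=2, idle_timeout=600):
    """Yield non-empty lines from a streaming response.

    idle_timeout is handed to requests as the read timeout, i.e. the
    longest silence tolerated between two pieces of data (assumption:
    the installed requests accepts a (connect, read) tuple).
    """
    response = requests.get(
        url, stream=True, timeout=(connect_timeout, idle_timeout)
    )
    try:
        for line in response.iter_lines():
            if line:  # skip blank keep-alive lines
                yield line
    finally:
        # Release the connection whether we finished or timed out.
        response.close()


def watch(url, idle_timeout=600):
    """Print lines until the stream ends or stays silent for too long."""
    try:
        for line in read_stream(url, idle_timeout=idle_timeout):
            print(line)
    except requests.exceptions.RequestException as e:
        # A stalled stream surfaces here; the concrete exception class
        # may differ between requests versions.
        print('stream stopped:', e)
```

Calling `watch('http://agent.mtconnect.org:80/sample?interval=0')` would then return after 10 minutes of silence, without a second process. Whether this fits depends on the server: if it sends keep-alive data during quiet periods, the read timeout never fires.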

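    【Comments】:

    • Hmm, that seems to work. However, I only receive one XML message every 5 seconds. I need to get them as quickly as possible ;)
    • The 5-second sleep does not actually suspend the child process; it only puts the main thread to sleep. XML messages should be processed in the child process as soon as they come back. Most likely the server or the requests module is adding the 5-second delay.
    • If it worked, you might go ahead and accept the answer :)
    • It worked, yes. But having more threads and processes is not an ideal solution. I thought there would be a method/function that can be used inside the loop. :)
    • If requests_stream.iter_lines is a blocking call (which it probably is), then there is no other way to do this, because a timeout function in the loop would not be called while waiting for data.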
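The last comment points at the core difficulty: a timeout check placed inside the loop never runs while the iterator is blocked. A lighter variant of the separate-process idea moves the blocking iteration into a daemon thread and lets the loop wait on a queue with a timeout. The following is a minimal Python 3 sketch using only the standard library; `iter_with_idle_timeout` is a hypothetical helper, not part of requests.

```python
import queue
import threading


def iter_with_idle_timeout(iterable, idle_timeout):
    """Yield items from `iterable`; raise TimeoutError if no item
    arrives within `idle_timeout` seconds.

    The (possibly blocking) iteration runs in a daemon thread, so the
    consumer waits on a queue with a timeout instead of on the source.
    """
    items = queue.Queue()

    def pump():
        # Forward items, then signal normal completion or the error.
        try:
            for item in iterable:
                items.put(('item', item))
            items.put(('done', None))
        except Exception as exc:
            items.put(('error', exc))

    threading.Thread(target=pump, daemon=True).start()

    while True:
        try:
            kind, value = items.get(timeout=idle_timeout)
        except queue.Empty:
            raise TimeoutError(
                'no data for %s seconds' % idle_timeout
            ) from None
        if kind == 'done':
            return
        if kind == 'error':
            raise value
        yield value
```

With the code from the question this could be used as `for line in iter_with_idle_timeout(requests_stream.iter_lines(), 600): ...`. Note that the background thread stays blocked on the socket after the timeout fires, so the caller would still need to close the response to release the connection.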