Correctly catch aiohttp TimeoutError when using asyncio.gather

【Question title】: Correctly catch aiohttp TimeoutError when using asyncio.gather
【Posted】: 2020-04-01 11:34:26
【Question】:

This is my first question on Stack Overflow, so I apologize if I did something stupid or left something out.

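I am trying to make asynchronous aiohttp GET requests to many API endpoints at once to check the status of those pages. The result for each URL should be a triple of the form (url, True, "200") for a working link and (url, False, response_status) for a "problematic link". This is the atomic function for each call: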

async def ping_url(url, session, headers, endpoint):

    try:
        async with session.get((url + endpoint), timeout=5, headers=headers) as response:
            return url, (response.status == 200), str(response.status)
    except Exception as e:
        test_logger.info(url + ": " + e.__class__.__name__)
        return url, False, repr(e)

These calls are wrapped with asyncio.gather() in a function that also creates the aiohttp session:

async def ping_urls(urllist, endpoint):

   headers = ... # not relevant

   async with ClientSession() as session:
        try:
            results = await asyncio.gather(*[ping_url(url, session, headers, endpoint) \
                      for url in urllist],return_exceptions=True)
        except Exception as e:
            print(repr(e))
   return results

The whole thing is called from a main that looks like this:

    urls = ... # not relevant
    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(ping_urls(urls, endpoint))

    except Exception as e:
        pass
    finally:
        loop.close()

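This works fine most of the time, but when the list is fairly long, I noticed that as soon as I get one TimeoutError, the execution loop stops and I get a TimeoutError for every other URL after the first one that timed out. If I omit the timeout in the innermost function I get somewhat better results, but then it is not that fast anymore. Is there a way to control the timeout for each single API call, instead of one big general timeout for the whole list of URLs?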

Any kind of help would be greatly appreciated; I am stuck on my bachelor thesis because of this issue.

【Comments】:

Tags: python concurrency aiohttp python-asyncio

【Solution 1】:

You might want to try setting a session timeout on your client session. This can be done like this:

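import asyncio
from aiohttp import ClientSession, ClientTimeout

TIMEOUT_SECONDS = 5  # example value (the question used 5 seconds); tune as needed

async def ping_urls(urllist, endpoint):
    headers = ... # not relevant

    # The session-level timeout is the default for every request made
    # through this session, unless a request passes its own timeout.
    timeout = ClientTimeout(total=TIMEOUT_SECONDS)
    async with ClientSession(timeout=timeout) as session:
        # With return_exceptions=True, failures come back as items in
        # results instead of being raised here, so no try/except is needed.
        results = await asyncio.gather(
            *[
                ping_url(url, session, headers, endpoint)
                for url in urllist
            ],
            return_exceptions=True
        )

    return results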

This should set up the ClientSession instance to use TIMEOUT_SECONDS as its timeout. Obviously you will need to set that value to something appropriate!
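One possible reason for the cascade of timeouts on long lists is that every request starts at once, so a request's timeout clock may already be running while it is still waiting for its turn. The following is only a sketch of one way to give each call its own timeout, under that assumption: a semaphore caps how many calls run concurrently, and asyncio.wait_for starts the clock only once a call holds a slot. It uses the standard library only; `ping_one`, `ping_all`, `fake_fetch`, and `max_concurrency` are hypothetical names, and `fake_fetch` stands in for a coroutine that would do the real `session.get(...)` and return `response.status`.

```python
import asyncio


async def ping_one(url, fetch, semaphore, timeout):
    """Return (url, ok, detail) for one URL without raising on ordinary failures."""
    async with semaphore:  # wait for a free slot before the timeout clock starts
        try:
            # The timeout applies to this single call only.
            status = await asyncio.wait_for(fetch(url), timeout)
            return url, status == 200, str(status)
        except asyncio.TimeoutError:
            return url, False, "TimeoutError"
        except Exception as e:
            return url, False, repr(e)


async def ping_all(urls, fetch, max_concurrency=20, timeout=5):
    # Create the semaphore inside the running event loop.
    semaphore = asyncio.Semaphore(max_concurrency)
    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(
        *(ping_one(url, fetch, semaphore, timeout) for url in urls)
    )


async def fake_fetch(url):
    # Hypothetical stand-in for an aiohttp request:
    # URLs containing "slow" hang for a second, all others answer quickly with 200.
    await asyncio.sleep(1.0 if "slow" in url else 0.01)
    return 200


results = asyncio.run(
    ping_all(["http://a", "http://slow", "http://b"], fake_fetch,
             max_concurrency=2, timeout=0.1)
)
print(results)
# [('http://a', True, '200'), ('http://slow', False, 'TimeoutError'), ('http://b', True, '200')]
```

Here one slow URL times out on its own while the others still report their status. Whether this helps in the original setup depends on where the time is actually being lost, so treat the concurrency limit and the timeout as values to experiment with.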

【Comments】: