【Question title】: Selenium in Python - one timeout causes all subsequent requests to time out
【Posted】: 2019-02-18 19:34:56
【Question description】:

Chrome driver version: 2.41; Chrome version: 69.0.3497.92

Here is my code, which sends multiple requests through a single webdriver and handles exceptions:

from selenium import webdriver
from selenium.common.exceptions import TimeoutException

options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--no-sandbox')

driver = webdriver.Chrome('/usr/local/bin/chromedriver', chrome_options=options)
driver.set_page_load_timeout(30)

for link in links:
    try:
        driver.get(link)
    except TimeoutException as e:
        # do something
        continue
    except Exception as e:
        # do some other thing
        continue

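The expected behavior is that when a TimeoutException is thrown, I move on to the next link, and so on. What actually happens is that once one TimeoutException occurs, every remaining link also throws a TimeoutException.

Here are the relevant entries from the chromedriver log.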

[1536872569.507][SEVERE]: Timed out receiving message from renderer: 29.449
[1536872569.509][INFO]: Timed out. Stopping navigation...
[1536872569.509][DEBUG]: DEVTOOLS COMMAND Page.stopLoading (id=1243) {

}
[1536872569.509][DEBUG]: DEVTOOLS RESPONSE Page.stopLoading (id=1243) {

}
[1536872569.509][DEBUG]: DEVTOOLS COMMAND Runtime.evaluate (id=1244) {
   "expression": "1"
}
[1536872569.510][SEVERE]: Timed out receiving message from renderer: -0.002
[1536872569.513][INFO]: Done waiting for pending navigations. Status: timeout
[1536872569.513][INFO]: RESPONSE Navigate timeout
  (Session info: headless chrome=69.0.3497.92)
[1536872569.516][INFO]: COMMAND Navigate {
  "sessionId": "9caf0bad68147065f14c9c22632cd6d8",
   "url": "www.example.com"
}
[1536872569.516][DEBUG]: DEVTOOLS EVENT Page.frameStoppedLoading {
   "frameId": "620369B66F0605C0CE359F34F9D95E36"
}
[1536872569.516][DEBUG]: DEVTOOLS RESPONSE Runtime.evaluate (id=1244) {
   "result": {
      "description": "1",
      "type": "number",
      "value": 1
   }
}
[1536872569.516][INFO]: Waiting for pending navigations...
[1536872569.516][DEBUG]: DEVTOOLS COMMAND Runtime.evaluate (id=1245) {
   "expression": "1"
}
[1536872569.517][DEBUG]: DEVTOOLS RESPONSE Runtime.evaluate (id=1245) {
   "result": {
      "description": "1",
      "type": "number",
      "value": 1
   }
}
[1536872599.516][SEVERE]: Timed out receiving message from renderer: 30.000
[1536872599.518][INFO]: Timed out. Stopping navigation...
[1536872599.518][DEBUG]: DEVTOOLS COMMAND Page.stopLoading (id=1246) {

}
[1536872599.518][DEBUG]: DEVTOOLS RESPONSE Page.stopLoading (id=1246) {

}
[1536872599.518][DEBUG]: DEVTOOLS COMMAND Runtime.evaluate (id=1247) {
   "expression": "1"
}
[1536872599.518][SEVERE]: Timed out receiving message from renderer: -0.002
[1536872599.522][INFO]: Done waiting for pending navigations. Status: timeout
[1536872599.522][INFO]: RESPONSE Navigate timeout
  (Session info: headless chrome=69.0.3497.92)
[1536872599.524][INFO]: COMMAND Navigate {
   "sessionId": "9caf0bad68147065f14c9c22632cd6d8",
   "url": "www.example2.com"
}

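Below are the differences I found when comparing this case with subsequent requests that completed without an exception.

1) DEVTOOLS EVENT Page.frameStoppedLoading appears immediately after the request for the new link "www.example.com" is sent.

2) The response to DEVTOOLS COMMAND Runtime.evaluate (id=1244), which was sent for the previous link, is logged after the request for the new URL.

Question: Is there any way to handle this other than restarting the driver on every TimeoutException?

I would also appreciate it if someone could explain this behavior in detail. Thanks.

【Question discussion】:

    Tags: python selenium google-chrome selenium-webdriver google-chrome-devtools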


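    【Solution 1】:

    Update:

    After reading the logs further, I realized that trying to send another request immediately causes that request not to be sent at all. The two observations in my original post also occur when requests succeed, so you can ignore them.

    Here is a comparison of the log for a successful consecutive request with the log for a consecutive request made after handling a timeout exception.

    When the chrome driver starts, the browser session is given an id (later referred to as frameId).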

       [1536915601.693][DEBUG]: DevTools request: http://localhost:34899/json
       [1536915601.694][DEBUG]: DevTools response: [ {
          "description": "",
          "devtoolsFrontendUrl": "/devtools/inspector.html?ws=localhost:34899/devtools/page/A417CC5AE2C87A4D0FC64CF66B54ED72",
          "id": "A417CC5AE2C87A4D0FC64CF66B54ED72",
          "title": "data:,",
          "type": "page",
          "url": "data:,",
          "webSocketDebuggerUrl": "ws://localhost:34899/devtools/page/A417CC5AE2C87A4D0FC64CF66B54ED72"
       } ]
    


    Case 1: a normal request following a successful response:

      [1536915607.033][INFO]: Done waiting for pending navigations. Status: ok
      [1536915607.033][INFO]: RESPONSE GetSource "\u003C!DOCTYPE html>\u003Chtml xmlns=\"http://www.w3.org/1999/xhtml\" lang=\"ko\">\u003Chead>\u003Cmeta http-equiv=\"Content-Type\" content=\"text/h       tml; charset=utf-8\" />\n\u003Cmeta name=\"viewport\" content=\"width=device-width, in..."
      [1536915607.044][INFO]: COMMAND Navigate {
         "sessionId": "d11fb86ec1b49a141f99fe1ec4286a85",
         "url": "http://www.gelloy.com/product/detail.html?product_no=438&cate_no=30&display_group=1"
      } 
     # ------ skipped for brevity ----- #
     [1536915607.044][INFO]: Done waiting for pending navigations. Status: ok
      [1536915607.044][DEBUG]: DEVTOOLS COMMAND Page.navigate (id=49) {
         "url": "http://www.gelloy.com/product/detail.html?product_no=438&cate_no=30&display_group=1"
      }
      [1536915609.244][DEBUG]: DEVTOOLS RESPONSE Page.navigate (id=49) {
         "frameId": "A417CC5AE2C87A4D0FC64CF66B54ED72",
         "loaderId": "0EB53CDA615428AA73A9DB67F5FF65E1"
      }
    

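    Here I can see:
    - COMMAND Navigate - prepares the next request
    - DEVTOOLS COMMAND Page.navigate - issues the request
    - DEVTOOLS RESPONSE Page.navigate - returns with the frameId given at startup

    Compare this with:

    Case 2: a request sent immediately after a timeout was triggered: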

      [1536872569.513][INFO]: Done waiting for pending navigations. Status: timeout
      [1536872569.513][INFO]: RESPONSE Navigate timeout
      (Session info: headless chrome=69.0.3497.92)
      [1536872569.516][INFO]: COMMAND Navigate {
      "sessionId": "9caf0bad68147065f14c9c22632cd6d8",
       "url": "www.example.com"
      }
      [1536872569.516][DEBUG]: DEVTOOLS EVENT Page.frameStoppedLoading {
       "frameId": "620369B66F0605C0CE359F34F9D95E36"
      }
      [1536872569.516][DEBUG]: DEVTOOLS RESPONSE Runtime.evaluate (id=1244) {
       "result": {
          "description": "1",
          "type": "number",
          "value": 1
       }
      }
      [1536872569.516][INFO]: Waiting for pending navigations...
      [1536872569.516][DEBUG]: DEVTOOLS COMMAND Runtime.evaluate (id=1245) {
       "expression": "1"
      }
      [1536872569.517][DEBUG]: DEVTOOLS RESPONSE Runtime.evaluate (id=1245) {
       "result": {
          "description": "1",
          "type": "number",
          "value": 1
       }
      }
    [1536872599.516][SEVERE]: Timed out receiving message from renderer: 30.000
    

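    After the timeout, however, I see COMMAND Navigate with the next URL to fetch, but DEVTOOLS COMMAND Page.navigate never happens. So 30 seconds after COMMAND Navigate is created, the driver decides whether the page has loaded based on the result of the most recent DEVTOOLS RESPONSE Page.navigate, which causes a timeout from then on.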
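The comparison above can be checked mechanically. The following is a minimal sketch, not part of the original answer, that scans chromedriver log text for COMMAND Navigate entries that were never followed by a DEVTOOLS COMMAND Page.navigate. The function name `navigations_without_page_navigate` is hypothetical, and the parsing assumes the log line layout shown in the excerpts above.

```python
import re

# Matches log entry headers such as "[1536872569.516][INFO]: COMMAND Navigate {".
# The layout is assumed from the excerpts in this post.
ENTRY = re.compile(r"^\s*\[(\d+\.\d+)\]\[(\w+)\]: (.*)$")
URL = re.compile(r'"url":\s*"([^"]*)"')


def navigations_without_page_navigate(log_text):
    """Return the URLs of COMMAND Navigate entries that were not followed by a
    DEVTOOLS COMMAND Page.navigate before the next COMMAND Navigate (or the end
    of the log)."""
    navigations = []          # one dict per COMMAND Navigate, in log order
    inside_navigate = False   # True while reading the body of a COMMAND Navigate
    for line in log_text.splitlines():
        entry = ENTRY.match(line)
        if entry:
            message = entry.group(3)
            inside_navigate = message.startswith("COMMAND Navigate ")
            if inside_navigate:
                navigations.append({"url": None, "sent": False})
            elif message.startswith("DEVTOOLS COMMAND Page.navigate") and navigations:
                navigations[-1]["sent"] = True
        elif inside_navigate:
            url = URL.search(line)
            if url:
                navigations[-1]["url"] = url.group(1)
    return [nav["url"] for nav in navigations if not nav["sent"]]
```

Run against a log containing both cases, this should list only the URLs from the Case 2 pattern, where the Navigate command was logged but the DevTools navigation was never issued.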


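    Solution

    I decided to close the driver with driver.quit() and open a new browser every time a timeout exception occurs. Adding time.sleep(1) before continuing the loop also seemed to work, but I can't be sure that 1 second is enough.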
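For reference, the time.sleep alternative mentioned above might look like the sketch below. It is an illustration under assumptions: the helper name `visit_all`, its `pause` and `sleep` parameters, and the fallback exception class are all hypothetical, and whether any fixed pause is long enough remains unverified, as noted above.

```python
import time

try:
    from selenium.common.exceptions import TimeoutException
except ImportError:
    class TimeoutException(Exception):
        """Stand-in so this sketch can run without Selenium installed."""


def visit_all(driver, links, pause=1.0, sleep=time.sleep):
    """Visit each link with the same driver; after a timeout, pause before
    issuing the next request. Returns the links that timed out."""
    timed_out = []
    for link in links:
        try:
            driver.get(link)
        except TimeoutException:
            timed_out.append(link)
            # Give the browser a moment before the next Navigate command.
            # The right duration is an assumption; 1 second is only what
            # seemed to work in the author's runs.
            sleep(pause)
    return timed_out
```

The `sleep` parameter exists only so the pause can be replaced in tests; in normal use the default `time.sleep` applies.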

    Here is my updated code:

    driver = webdriver.Chrome('/usr/local/bin/chromedriver', chrome_options=options)
    driver.set_page_load_timeout(30)
    
    for link in links:
        try:
            driver.get(link)
        except TimeoutException as e:
            # do something
            driver.quit()
            driver = webdriver.Chrome('/usr/local/bin/chromedriver', chrome_options=options)
            driver.set_page_load_timeout(30)           
            continue
        except Exception as e:
            # do some other thing
            continue
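The restart logic above repeats the driver construction in two places. One possible refactoring, sketched here under assumptions, moves construction into a factory function and guarantees the last driver is closed when the loop ends. The names `make_chrome` and `crawl` are hypothetical, and `make_chrome` simply mirrors the constructor call and options used in this post (the Selenium API of that era; newer releases may require a different call).

```python
try:
    from selenium.common.exceptions import TimeoutException
except ImportError:
    class TimeoutException(Exception):
        """Stand-in so this sketch can run without Selenium installed."""


def make_chrome():
    """Build a headless Chrome driver the same way the post does."""
    from selenium import webdriver  # imported lazily; needed only for real runs
    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    options.add_argument('--no-sandbox')
    driver = webdriver.Chrome('/usr/local/bin/chromedriver', chrome_options=options)
    driver.set_page_load_timeout(30)
    return driver


def crawl(links, make_driver=make_chrome):
    """Visit each link, replacing the driver after every timeout.
    Returns the links that timed out."""
    failed = []
    driver = make_driver()
    try:
        for link in links:
            try:
                driver.get(link)
            except TimeoutException:
                failed.append(link)
                driver.quit()
                driver = None  # avoid quitting the old driver twice if the restart fails
                driver = make_driver()
    finally:
        if driver is not None:
            driver.quit()
    return failed
```

Passing the factory as a parameter keeps the loop independent of how the browser is configured, so the same function can be exercised with a fake driver.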
    

    【Discussion】:
