Scraping dynamic data with Python: answers

【Question title】: Scraping dynamic data with Python
【Posted】: 2018-05-12 00:46:50
【Question】:

I have this simple HTML page:

<html>
  <body>
    <p>Javascript (dynamic data) test:</p>
    <p class='jstest' id='yesnojs'>Hello</p>

    <button onclick="myFunction()">Try it</button>

    <script>
      function myFunction() {
        document.getElementById('yesnojs').innerHTML = 'GoodBye';
      }
    </script>
  </body>
</html>

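I now want to scrape this page with Python to find out when the element with ID "yesnojs" reads "GoodBye", that is, after the user has clicked the button. I have tried several tutorials, but I always get "Hello", regardless of whether I have clicked the button and am looking at the page showing "GoodBye".

I hope you can help. Thanks.

PS: this is the Python code I tried for scraping the page: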

from selenium import webdriver

chrome_path = "C:\\Users\\Antonio\\Downloads\\chromedriver_win32\\chromedriver.exe"

driver = webdriver.Chrome(chrome_path)

driver.get("http://localhost/templates/scraping.html")
review = driver.find_elements_by_class_name("jstest")
for post in review:
    print(post.text)

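【Comments】:

Where is your scraping code?
The value of document.getElementById('yesnojs').innerHTML lives in RAM. So I don't see how you could get the changed value by scraping the HTML file. I haven't done this myself, so this is more of a question.
Please upload your Python code.

Tags: python html selenium web-scraping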

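【Solution 1】:

Selenium does not attach to a web page you already have open. It opens a new browser window and loads the page there, so a click you made in your own browser is never seen by the script. If you are designing a unit test, you have to simulate the click with Selenium itself.

Alternatively, if you are thinking of building a browser extension that scrapes the page when this event happens, Selenium is not the tool for that.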
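
The first point, simulating the click from Selenium, could look roughly like the sketch below. It is not from the original answer. It assumes a Selenium 4 style API (`find_element(by, value)` and `webdriver.Chrome()` called without a driver path), and the helper names `text_after_click` and `run_against_local_page` are hypothetical. The URL and the `yesnojs` ID come from the question; locating the button by tag name is an assumption, since the button has no ID.

```python
def text_after_click(driver, url, button_locator, target_locator):
    """Load url, read the target text, click the button, read the text again.

    Locators are (strategy, value) tuples as accepted by find_element.
    Returns a (before, after) pair of strings.
    """
    driver.get(url)
    before = driver.find_element(*target_locator).text
    # The click happens inside the browser session Selenium controls,
    # so the page's onclick handler runs in that same session.
    driver.find_element(*button_locator).click()
    # Assumption: the handler updates the element synchronously, as in the
    # question's page. A page that updates asynchronously would need an
    # explicit wait before this read.
    after = driver.find_element(*target_locator).text
    return before, after


def run_against_local_page():
    """Hypothetical entry point; needs Selenium and Chrome installed."""
    # Imported here so the helper above has no hard dependency on Selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        before, after = text_after_click(
            driver,
            "http://localhost/templates/scraping.html",  # URL from the question
            (By.TAG_NAME, "button"),  # assumption: the only button on the page
            (By.ID, "yesnojs"),
        )
        print(before, after)
    finally:
        driver.quit()
```

Calling `run_against_local_page()` with the question's page served locally should show the text before and after the click. The helper takes the driver as a parameter so that the same logic can be reused with another browser or in a test.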

【Comments】: