【Question title】: How to get a table with dynamic id using Selenium with Python
【Posted】: 2020-10-18 16:40:14
【Question description】:
【Comments】:
Tags:
python
selenium
xpath
web-scraping
css-selectors
【Solution 1】:
The table WebElement is loaded via AJAX, so to print its values you have to induce WebDriverWait for visibility_of_element_located(). You can use either of the following Locator Strategies:
- Using CSS_SELECTOR:
driver.get('https://www.holidayfrancedirect.co.uk/holiday-rentals/RG007075/index.htm')
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table.tablesaw.tablesaw-stack.table-bordered.table-centered.rates-availability-table"))).text)
- Using XPATH:
driver.get('https://www.holidayfrancedirect.co.uk/holiday-rentals/RG007075/index.htm')
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table[@class='tablesaw tablesaw-stack table-bordered table-centered rates-availability-table']"))).text)
- Console output:
Start Date End Date 3 Nights 4 Nights 5 Nights 6 Nights 7 Nights
28 Mar 2020 1 May 2020 £225 £300 £350 £410 £470
2 May 2020 26 Jun 2020 £250 £330 £400 £460 £530
27 Jun 2020 3 Jul 2020 - - - - £675
4 Jul 2020 10 Jul 2020 - - - - £920
11 Jul 2020 14 Aug 2020 - - - - £985
15 Aug 2020 21 Aug 2020 - - - - £920
22 Aug 2020 28 Aug 2020 - - - - £675
29 Aug 2020 31 Oct 2020 - - - - £470
- Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
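Both locators above target the table by its class list rather than by its id, which is what makes them usable when the id is dynamic. They are not equivalent, though: the CSS form matches each class independently, while the XPath form compares the whole class attribute as one exact string, so it stops matching if the page reorders or adds a class. Here is a minimal sketch of how the two strings relate. The helper names (css_for_classes, xpath_exact_class, xpath_contains_classes) are made up for illustration and are not part of Selenium; the last one is a common order-independent XPath idiom, not something the answer itself used.

```python
# Hypothetical helpers (not part of Selenium) that build locator strings
# from a tag name and a list of class names.

def css_for_classes(tag, classes):
    """CSS selector: each class is matched independently, in any order."""
    return tag + "".join("." + c for c in classes)


def xpath_exact_class(tag, classes):
    """XPath comparing the whole class attribute as one exact string."""
    return "//{}[@class='{}']".format(tag, " ".join(classes))


def xpath_contains_classes(tag, classes):
    """XPath that checks each class as a whole token, in any order."""
    conditions = " and ".join(
        "contains(concat(' ', normalize-space(@class), ' '), ' {} ')".format(c)
        for c in classes
    )
    return "//{}[{}]".format(tag, conditions)


TABLE_CLASSES = [
    "tablesaw",
    "tablesaw-stack",
    "table-bordered",
    "table-centered",
    "rates-availability-table",
]

if __name__ == "__main__":
    print(css_for_classes("table", TABLE_CLASSES))
    print(xpath_exact_class("table", TABLE_CLASSES))
    print(xpath_contains_classes("table", ["rates-availability-table"]))
```

The first two calls reproduce the locator strings used above. Any of the three strings can be passed to visibility_of_element_located() together with By.CSS_SELECTOR or By.XPATH as appropriate.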
【Solution 2】:
The data is loaded dynamically via JavaScript. However, you can use their API to load the table directly.
For example:
import requests
from bs4 import BeautifulSoup

url = 'https://www.holidayfrancedirect.co.uk/holiday-rentals/RG007075/index.htm'
rates_url = 'https://www.holidayfrancedirect.co.uk/api/property-rates/{property_id}/2020'

# the property id is the second-to-last segment of the page URL
property_id = url.split('/')[-2]
data = requests.get(rates_url.format(property_id=property_id)).json()
# the JSON response carries the table as an HTML fragment
soup = BeautifulSoup(data['ratesHtml'], 'html.parser')

# print table to screen:
for tr in soup.select('tr'):
    tds = [td.get_text(strip=True) for td in tr.select('td, th')]
    print(('{:<15}'*7).format(*tds))
Prints:
Start Date     End Date       3 Nights       4 Nights       5 Nights       6 Nights       7 Nights
28 Mar 2020    1 May 2020     £225           £300           £350           £410           £470
2 May 2020     26 Jun 2020    £250           £330           £400           £460           £530
27 Jun 2020    3 Jul 2020     -              -              -              -              £675
4 Jul 2020     10 Jul 2020    -              -              -              -              £920
11 Jul 2020    14 Aug 2020    -              -              -              -              £985
15 Aug 2020    21 Aug 2020    -              -              -              -              £920
22 Aug 2020    28 Aug 2020    -              -              -              -              £675
29 Aug 2020    31 Oct 2020    -              -              -              -              £470
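The snippet above hard-codes the year in the rates URL and takes the property id with a plain url.split('/'), which picks the wrong segment if the page URL carries a query string. A small sketch of a slightly more defensive URL builder follows. The function names and the year parameter are my own additions, and I am only assuming the endpoint accepts years other than 2020 because the year appears as the last path segment; I have not confirmed that.

```python
from urllib.parse import urlsplit

# URL pattern taken from the answer above; turning the trailing "2020"
# into a {year} placeholder is an assumption about the endpoint.
RATES_URL = 'https://www.holidayfrancedirect.co.uk/api/property-rates/{property_id}/{year}'


def property_id_from_url(page_url):
    """Return the second-to-last path segment, ignoring any query string."""
    segments = urlsplit(page_url).path.strip('/').split('/')
    return segments[-2]


def build_rates_url(page_url, year):
    """Build the rates API URL for a property page URL and a year."""
    return RATES_URL.format(property_id=property_id_from_url(page_url), year=year)


if __name__ == "__main__":
    page = 'https://www.holidayfrancedirect.co.uk/holiday-rentals/RG007075/index.htm'
    print(build_rates_url(page, 2020))
```

The result can be passed to requests.get() exactly as in the snippet above.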