Print only one 'tr' tag in 'tbody' - Beautifulsoup answers

【Question title】: print only one 'tr' tag in 'tbody' - Beautifulsoup
【Posted】: 2020-07-11 08:17:40
【Question description】:

I am trying to print the contents of only one 'tr' tag in 'tbody'. I used this code to print every 'tr' in 'tbody', but Python does not print the 'tr' rows that come after Berlin. I used this URL: https://interaktiv.morgenpost.de/corona-virus-karte-infektionen-deutschland-weltweit/?fbclid=IwAR0xb7zTV0vstu-sLE3ByHZVSw89HyqjSwMhpfXT23RwcFqR57za2J_l7XQ. This is the table I want to print in full: https://i.stack.imgur.com/i869g.png

from bs4 import BeautifulSoup
from selenium import webdriver 


browser = webdriver.Chrome()
url = "https://interaktiv.morgenpost.de/corona-virus-karte-infektionen-deutschland-weltweit/?fbclid=IwAR0xb7zTV0vstu-sLE3ByHZVSw89HyqjSwMhpfXT23RwcFqR57za2J_l7XQ"
browser.get(url)
soup = BeautifulSoup(browser.page_source, "html.parser")



allStat = {}

table_body = soup.find('tbody')
table_rows = table_body.find_all('tr')

for i in table_rows:
    region = i.find('td', class_ = 'region').get_text()
    confirmed = i.find('td', class_ = 'confirmed').get_text()
    deaths = i.find('td', class_= 'deaths' ).get_text()

    allStat.update({region: [confirmed,deaths]})
print(allStat)

【Comments on the question】:

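Could you wait until the element is present in the HTML source? Otherwise use the built-in time library. check, I also downvoted this question because it is an exact duplicate.
Does this answer your question? Selenium Webdriver: (python) wait for element to not be present (not working)
I don't understand how to make it work the way I described.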
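
The first comment suggests waiting for the rows to appear before reading the page source, falling back on the built-in time library. A minimal sketch of that idea follows. The helper names `wait_until` and `wait_for_rows` are hypothetical, and the `tbody tr` selector is an assumption based on the question's code. Selenium also ships its own `WebDriverWait` and `expected_conditions` helpers, which do the same job.

```python
import time

# "css selector" is the string value behind selenium's By.CSS_SELECTOR.
CSS_SELECTOR = "css selector"


def wait_until(predicate, timeout=10.0, poll=0.5):
    """Poll predicate() until it returns a truthy value or `timeout` seconds pass.

    Returns the truthy value, or None if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


def wait_for_rows(driver, min_rows=1, timeout=10.0, poll=0.5):
    """Return the <tr> elements inside <tbody> once at least `min_rows` exist, else None."""
    def enough_rows():
        rows = driver.find_elements(CSS_SELECTOR, "tbody tr")
        return rows if len(rows) >= min_rows else None

    return wait_until(enough_rows, timeout, poll)
```

With the question's code, `wait_for_rows(browser)` would be called after `browser.get(url)` and before `browser.page_source` is read. Waiting only helps if the rows are eventually rendered on their own. Solution 1 instead clicks the table's expand button.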

Tags: python selenium web beautifulsoup screen-scraping

【Solution 1】:

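from io import StringIO

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options

# Run Firefox without opening a browser window
options = Options()
options.add_argument('--headless')
driver = webdriver.Firefox(options=options)

driver.get(
    "https://interaktiv.morgenpost.de/corona-virus-karte-infektionen-deutschland-weltweit/")

# Click the table's expand button so the remaining rows are rendered
driver.find_element(By.CSS_SELECTOR, "button.btn.fnktable__expand").click()

# Parse the first table on the page into a DataFrame and save it as CSV
df = pd.read_html(StringIO(driver.page_source))[0]
df.to_csv("data.csv", index=False)

driver.quit()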
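
The solution above writes the table to a CSV file, while the question builds a dictionary of region to [confirmed, deaths]. As a rough sketch, the expanded page source could also be passed to BeautifulSoup to rebuild that dictionary. The function name `parse_stats` is hypothetical, and the cell classes `region`, `confirmed` and `deaths` are taken from the question's code, so they are assumed to match the page's markup.

```python
from bs4 import BeautifulSoup


def parse_stats(html):
    """Map each region to [confirmed, deaths] for every complete table row."""
    soup = BeautifulSoup(html, "html.parser")
    stats = {}
    for row in soup.find_all("tr"):
        # Look up the three cells by the class names used in the question.
        cells = [row.find("td", class_=name)
                 for name in ("region", "confirmed", "deaths")]
        if any(cell is None for cell in cells):
            continue  # skip header rows or rows missing an expected cell
        region, confirmed, deaths = (cell.get_text(strip=True) for cell in cells)
        stats[region] = [confirmed, deaths]
    return stats
```

Calling `parse_stats(driver.page_source)` after the click and before `driver.quit()` would return the dictionary. Rows that lack one of the three cells are skipped instead of raising an `AttributeError` on `None`.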

【Discussion】: