【Posted】: 2021-08-26 23:35:42
【Question】:
I am trying to build a weather forecast by scraping a web page. (My previous question)
My code:
import time
import requests
from selenium import webdriver
from bs4 import BeautifulSoup
from keyboard import press_and_release

def weather_forecast2():
    print('Hello, I can search up the weather for you.')
    while True:
        inp = input('Where shall I search? Enter a place :').capitalize()
        print('Alright, checking the weather in ' + inp + '...')
        URL = 'https://www.yr.no/nb'

        "Search for a place"
        driver = webdriver.Edge()  # Open Microsoft Edge
        driver.get(URL)  # Goes to the HTML-page of the given URL
        element = driver.find_element_by_id("søk")  # Find the search input box
        element.send_keys(inp)  # Enter input
        press_and_release('enter')  # Click enter
        cURL = driver.current_url  # Current URL

        "Find data"
        driver.get(cURL)  # Goes to the HTML-page that appeared after clicking button
        r = requests.get(cURL)  # Get request for contents of the page
        print(r.content)  # Outputs HTML code for the page
        soup = BeautifulSoup(r.content, 'html5lib')  # Parse the data with BeautifulSoup(HTML-string, HTML-parser)
I want to collect the temperatures from the page. I know that the XPaths of the elements I am looking for are:
//*[@id="dailyWeatherListItem0"]/div[2]/div[1]/span[2]/span[1]/text()
//*[@id="dailyWeatherListItem0"]/div[2]/div[1]/span[2]/span[3]/text()
//*[@id="dailyWeatherListItem1"]/div[2]/div[1]/span[2]/span[1]/text()
//*[@id="dailyWeatherListItem1"]/div[2]/div[1]/span[2]/span[3]/text()
//*[@id="dailyWeatherListItem2"]/div[2]/div[1]/span[2]/span[1]/text()
//*[@id="dailyWeatherListItem2"]/div[2]/div[1]/span[2]/span[3]/text()
//*[@id="dailyWeatherListItem3"]/div[2]/div[1]/span[2]/span[1]/text()
//*[@id="dailyWeatherListItem3"]/div[2]/div[1]/span[2]/span[3]/text()
// and so on...
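Basically, I want to collect the following two elements nine times, once for each item number from 0 to 8:
//*[@id="dailyWeatherListItem{NUMBER 0-8}"]/div[2]/div[1]/span[2]/span[1]/text()
//*[@id="dailyWeatherListItem{NUMBER 0-8}"]/div[2]/div[1]/span[2]/span[3]/text()
How can I do this with driver.find_element_by_xpath? Or is there a more efficient function?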
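The repeated pattern could be generated in a loop rather than written out nine times. The following is only a minimal sketch, assuming the paths are meant to start with `//*` and to use `div[1]` and `span[1]`, and that the page is already loaded in `driver`. The names `build_temperature_xpaths` and `collect_temperatures` are hypothetical, not part of the original code. The trailing `/text()` is dropped because Selenium locators must resolve to elements, not text nodes; the text is read from the `.text` attribute instead.

```python
def build_temperature_xpaths(days=9):
    """Return (first_span_xpath, third_span_xpath) pairs for list items 0..days-1."""
    # Assumed pattern, reconstructed from the XPaths listed in the question.
    base = '//*[@id="dailyWeatherListItem{}"]/div[2]/div[1]/span[2]/span[{}]'
    return [(base.format(i, 1), base.format(i, 3)) for i in range(days)]


def collect_temperatures(driver, days=9):
    """Read both temperature spans for each day from an already loaded page."""
    # Imported here so the XPath helper above works without Selenium installed.
    from selenium.webdriver.common.by import By

    temperatures = []
    for first_xpath, third_xpath in build_temperature_xpaths(days):
        # find_elements returns an empty list instead of raising when nothing matches.
        first = driver.find_elements(By.XPATH, first_xpath)
        third = driver.find_elements(By.XPATH, third_xpath)
        if first and third:
            temperatures.append((first[0].text, third[0].text))
    return temperatures
```

Calling `collect_temperatures(driver)` after the search results page has finished loading would return a list of text pairs, one per day that was found. The `By.XPATH` form is used because recent Selenium releases no longer provide `find_element_by_xpath`; on the older release used in the question, `driver.find_elements_by_xpath(...)` plays the same role.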
【Comments】:
- Could you include the definition of press_and_release, along with all the necessary import statements?
Tags: python html selenium-webdriver web-scraping beautifulsoup