【Posted】: 2020-07-25 20:38:30
【Question】:
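I am trying to extract text with BeautifulSoup or requests from this Facebook page: https://www.facebook.com/marketplace/item/1612977352197759/
The text I want is the item description, i.e. the text that appears before the map. This is what I have tried so far, which did not work: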
import requests
from lxml import html
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

opt = Options()
opt.add_argument("--disable-infobars")
opt.add_argument("start-maximized")
# Pass the argument 1 to allow and 2 to block
opt.add_experimental_option("prefs", {
    "profile.default_content_setting_values.media_stream_mic": 2,
    "profile.default_content_setting_values.media_stream_camera": 2,
    "profile.default_content_setting_values.geolocation": 2,
    "profile.default_content_setting_values.notifications": 2,
})

# Pass the options to Chrome; ChromeDriver is expected to be on PATH
driver = webdriver.Chrome(options=opt)
driver.get('https://www.google.com')

# The item page itself is fetched with requests, not with the Selenium driver
page = requests.get('https://www.facebook.com/marketplace/item/1612977352197759/?ref=messenger_banner')
tree = html.fromstring(page.content)
print(tree)

# Look for any <span> whose text contains 'hello'
link = tree.xpath("//span[contains(string(),'hello')]")
print(link)
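
A possible reason the attempt above does not find the description: `requests.get` returns only the initial HTML and does not run JavaScript, while the Selenium driver only ever opens Google. The sketch below shows one alternative, which is to let the browser load the item URL and then parse `driver.page_source` with lxml. The helper names `spans_containing` and `fetch_rendered_html`, the headless flag, the 10-second wait, and the idea that the description sits inside a `<span>` are all assumptions made for illustration, not details from the question. Facebook may also require a login, which this sketch does not handle.

```python
from lxml import html


def spans_containing(page_source, needle):
    """Return the text of every <span> whose text content contains `needle`."""
    tree = html.fromstring(page_source)
    # Pass the needle as an XPath variable so quotes in it need no escaping.
    spans = tree.xpath("//span[contains(., $needle)]", needle=needle)
    return [span.text_content().strip() for span in spans]


def fetch_rendered_html(url, wait_seconds=10):
    """Load `url` in Chrome and return the DOM after the page has rendered."""
    # Imported here so the parsing helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Assumption: the description is rendered inside at least one <span>.
        WebDriverWait(driver, wait_seconds).until(
            lambda d: d.find_elements(By.TAG_NAME, "span")
        )
        return driver.page_source
    finally:
        driver.quit()


# Offline check of the parsing step on a small hand-written snippet.
sample = "<html><body><div><span>hello world</span><span>map</span></div></body></html>"
print(spans_containing(sample, "hello"))  # ['hello world']
```

Against the real page, the call would look something like `spans_containing(fetch_rendered_html(url), "hello")` with the item URL from the question. Because nested spans all match `contains(., ...)`, the same text can appear more than once in the result.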
【Discussion】:
Tags: python-3.x beautifulsoup python-requests