BeautifulSoup in Python doesn't find what I am looking for

【Question title】: BeautifulSoup in Python doesn't find what I am looking for
【Posted】: 2020-11-01 04:08:30
【Question description】:

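I am trying to extract the text in the grand final section (the name of the winning team): https://i.stack.imgur.com/4QPqI.png

My problem is that the text I want to extract is not found by soup. The deepest element it finds is (class="sgg2h1cC DEPRECATED_bootstrap_container undefined native-scroll dragscroll"), but as you can see here: https://i.imgur.com/Brmv6ba.png there is more nested inside it.

Here is my code. Can someone explain how I can get the information I am looking for? I am also very new to web scraping.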

import requests
from bs4 import BeautifulSoup

URL = 'https://smash.gg/tournament/revolve-oceania-2v2-finale/event/revolve-oceania-2v2-finale-event/brackets/841267/1343704'
page = requests.get(URL)

soup = BeautifulSoup(page.content, 'html.parser')
results = soup.find(id="app_feature_canvas")
a = results.find_all('div', class_="regionWrapper-APP_TOURNAMENT_PAGE-FeatureCanvas")
print()
for b in a:
    c = b.find('div', class_="page-section page-section-grey")
    print(c)

【Question comments】:

Tags: python web-scraping beautifulsoup

【Solution 1】:

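What you see in the inspector is not the same as what you get when using requests. Instead of the developer console, look at the page source.

Those parts of the page are generated by JavaScript, so they are not present when you fetch the page with requests.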

import requests

URL = 'https://smash.gg/tournament/revolve-oceania-2v2-finale/event/revolve-oceania-2v2-finale-event/brackets/841267/1343704'
page = requests.get(URL)
print(page.text)  # notice this is nothing like what you see in the inspector
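
The difference can be reproduced offline. This is a minimal sketch, assuming two made-up HTML strings: RAW_HTML stands in for what requests receives and RENDERED_HTML for what the inspector shows after scripts have run. The team name and the helper find_section are hypothetical and are not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the response body requests receives:
# an empty container that JavaScript would fill in inside a browser.
RAW_HTML = """
<html><body>
  <div id="app_feature_canvas"></div>
  <script>/* JavaScript that builds the bracket at runtime */</script>
</body></html>
"""

# Hypothetical stand-in for the DOM shown in the inspector after scripts ran.
RENDERED_HTML = """
<html><body>
  <div id="app_feature_canvas">
    <div class="page-section page-section-grey">Grand Final: Team A</div>
  </div>
</body></html>
"""


def find_section(html):
    """Return the text of the grey page section, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    section = soup.find("div", class_="page-section page-section-grey")
    return section.get_text(strip=True) if section else None


print(find_section(RAW_HTML))       # the section is missing from the raw HTML
print(find_section(RENDERED_HTML))  # the section exists once the DOM is populated
```

The parser is the same in both calls; only the input differs, which is why changing the BeautifulSoup query does not help when the element was never in the downloaded HTML.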

To execute the JavaScript, consider using selenium instead of requests.

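from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome()
driver.get(URL)
# Current DOM as rendered by the browser; late-loading content may need an explicit wait
html = driver.page_source
driver.quit()
soup = BeautifulSoup(html, 'html.parser')
# ... go from here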

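Alternatively, the page source may already contain enough information to get what you are looking for. Note that the page source contains a lot of JSON with all kinds of information, which the JavaScript presumably uses to populate those elements.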
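
Reading embedded JSON could look roughly like the sketch below. Everything about the data here is an assumption for illustration: the script id "initial-data", the "sets", "round" and "winner" keys, and the team names are invented, so inspect the real page source to find the actual tag and structure.

```python
import json

from bs4 import BeautifulSoup

# Hypothetical page source: the script id and the JSON shape are invented.
PAGE_SOURCE = """
<html><body>
  <div id="app_feature_canvas"></div>
  <script id="initial-data" type="application/json">
    {"sets": [{"round": "Semi Final", "winner": "Team B"},
              {"round": "Grand Final", "winner": "Team A"}]}
  </script>
</body></html>
"""


def grand_final_winner(html):
    """Return the grand final winner from the embedded JSON, or None."""
    soup = BeautifulSoup(html, "html.parser")
    script = soup.find("script", id="initial-data")
    if script is None or not script.string:
        return None
    data = json.loads(script.string)  # parse the script body as JSON
    for match in data.get("sets", []):
        if match.get("round") == "Grand Final":
            return match.get("winner")
    return None


print(grand_final_winner(PAGE_SOURCE))
```

When the data is already in the response body, this approach needs no browser, which is usually faster than selenium. If the JSON is assigned to a JavaScript variable rather than stored in its own script tag, the surrounding JavaScript has to be stripped before calling json.loads.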

You can also copy/paste from the DOM explorer in the inspector (right-click the html element and click "Copy outer HTML").

import pyperclip
from bs4 import BeautifulSoup

html = pyperclip.paste()  # put contents of the clipboard into a variable
soup = BeautifulSoup(html, 'html.parser')
results = soup.find(id="app_feature_canvas")
a = results.find_all('div', class_="regionWrapper-APP_TOURNAMENT_PAGE-FeatureCanvas")
print()
for b in a:
    c = b.find('div', class_="page-section page-section-grey")
    print(c)

This works :-)

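【Comments】:

But I want it to be automatic rather than copy and paste.
In that case, use selenium or one of the other methods mentioned.
When you say "note that there is a lot of JSON", I don't see any JSON.
I really don't understand.
Around line 4700 I see a ton of JSON. Skimming it, it seems to include things like usernames, placements, and so on.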