【Question title】: Python BeautifulSoup - find table by id returning 'none'
【Posted】: 2019-04-18 01:44:24
【Question】:

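For some reason I can't find or select a table by its id. I have been following the BeautifulSoup documentation, and as far as I can tell this should work.

Below is a code sample that tries to select the table with id "per_game". content.find(id='per_game') doesn't work for me either.

I have been referring to the find and CSS selector sections of the documentation, here: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#find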

import requests
import csv
import calendar
from datetime import date, datetime, timedelta
from collections import OrderedDict, defaultdict
from bs4 import BeautifulSoup as soup

season = str(date.today().year + 1)
month = calendar.month_name[date.today().month].lower()

teamUrl = "https://basketball-reference.com/teams/"

urls       =    [teamUrl + 'ATL/' + season +'.html'] # Atlanta Hawks
                 # teamUrl + 'BOS/' + season +'.html', # Boston Celtics
                 # teamUrl + 'BKN/' + season +'.html', # Brooklyn Nets
                 # teamUrl + 'CHA/' + season +'.html', # Charlotte Hornets

for url in urls:
    page = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
    content = soup(page.content, 'html.parser')
    table = content.select("#per_game")
    print(table)

Many thanks,
O.

【Comments】:

Tags: python beautifulsoup


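【Solution 1】:

This isn't Ajax. The table is in the HTML that requests downloads, but it is wrapped in an HTML comment, so the parser treats it as comment text rather than as tags. Just strip the comment markers from the HTML before parsing: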

page = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
html_doc = page.text.replace('<!--', '').replace('-->', '')
content = soup(html_doc, 'html.parser')
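
If stripping every comment marker from the page feels too broad, another option is to parse only the comments that contain the table. The following is a minimal sketch under that assumption, not the method from the answer above: `SAMPLE_HTML` is a made-up stand-in for the downloaded page, and `find_commented_table` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical stand-in for the downloaded page: the table is wrapped
# in an HTML comment, so a normal lookup cannot see it.
SAMPLE_HTML = """
<html><body>
<div id="all_per_game">
<!--
<table id="per_game"><tr><td>Player A</td></tr></table>
-->
</div>
</body></html>
"""


def find_commented_table(html, table_id):
    """Return the element with the given id, looking inside comments if needed."""
    doc = BeautifulSoup(html, "html.parser")

    # Normal lookup first: works when the table is not commented out.
    table = doc.find(id=table_id)
    if table is not None:
        return table

    # Otherwise, re-parse the text of each comment and search it.
    for comment in doc.find_all(string=lambda s: isinstance(s, Comment)):
        inner = BeautifulSoup(comment, "html.parser")
        table = inner.find(id=table_id)
        if table is not None:
            return table
    return None


# A plain find() on the sample returns None, reproducing the question.
plain_result = BeautifulSoup(SAMPLE_HTML, "html.parser").find(id="per_game")

# The helper recovers the table from inside the comment.
table = find_commented_table(SAMPLE_HTML, "per_game")
```

On a real page, `page.text` from the request would be passed in place of `SAMPLE_HTML`. This leaves the rest of the document untouched, at the cost of parsing each comment separately.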

【Discussion】:
