【Question title】: Extract an attribute value from an element with Beautiful Soup
【Posted】: 2021-12-29 15:03:03
【Question】:
How can I extract "2021-12-30" from the data-dat attribute?
<td class="select_date" data-dat="2021-12-30" data-dat-f="30. Dec 2021" data-global-index="32" data-title="30.12.2021" id="d30-12-2021"><span class="bubble">30</span></td>
【Discussion】:
Tags:
python
web-scraping
beautifulsoup
【Solution 1】:
You can read the value from the tag's attrs dictionary:
>>> soup.find("td").attrs['data-dat']
'2021-12-30'
Full example:
# import module
from bs4 import BeautifulSoup
# create data
dom = """<td class="select_date" data-dat="2021-12-30" data-dat-f="30. Dec 2021" data-global-index="32" data-title="30.12.2021" id="d30-12-2021"><span class="bubble">30</span></td>"""
soup = BeautifulSoup(dom, 'html.parser')
# Extract "data-dat" attribute
print(soup.find("td").attrs['data-dat'])
# 2021-12-30
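
If the page contains more than one td, soup.find("td") only returns the first one, which may not be the cell you want. A minimal sketch of narrowing the search by the class from the question and reading the attribute defensively is below. The surrounding table, the second date cell, and the "other" cell are made up for illustration and are not part of the original markup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the first cell mirrors the question, the rest is invented
html = """<table><tr>
<td class="select_date" data-dat="2021-12-30" id="d30-12-2021"><span class="bubble">30</span></td>
<td class="select_date" data-dat="2021-12-31" id="d31-12-2021"><span class="bubble">31</span></td>
<td class="other">no date here</td>
</tr></table>"""

soup = BeautifulSoup(html, "html.parser")

# Restrict the search to cells with class "select_date";
# find() returns None when nothing matches, so guard before reading
cell = soup.find("td", class_="select_date")
first_date = cell.get("data-dat") if cell is not None else None

# A CSS selector collects every matching cell that actually has the attribute
all_dates = [td["data-dat"] for td in soup.select("td.select_date[data-dat]")]

# .get() returns None for a missing attribute instead of raising KeyError
missing = soup.find("td", class_="other").get("data-dat")

print(first_date)
print(all_dates)
print(missing)
```

Indexing with tag["data-dat"] is equivalent to tag.attrs["data-dat"] and raises KeyError when the attribute is absent, so .get() is the safer choice when some cells may lack it.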