【Posted】: 2019-06-03 04:26:56
【Question】:
I have an HTML page that contains multiple divs, for example:
<div class="post-info-wrap">
<h2 class="post-title"><a href="https://www.example.com/blog/111/this-is-1st-post/" title="Example of 1st post – Example 1 Post" rel="bookmark">sample post – example 1 post</a></h2>
<div class="post-meta clearfix">
<div class="post-info-wrap">
<h2 class="post-title"><a href="https://www.example.com/blog/111/this-is-2nd-post/" title="Example of 2nd post – Example 2 Post" rel="bookmark">sample post – example 2 post</a></h2>
<div class="post-meta clearfix">
I need to get the values of all div elements with the class post-info-wrap. I am new to BeautifulSoup.
So I need these URLs:
and so on...
Here is what I tried:
import re
import requests
from bs4 import BeautifulSoup
r = requests.get("https://www.example.com/blog/author/abc")
data = r.content # Content of response
soup = BeautifulSoup(data, "html.parser")
for link in soup.select('.post-info-wrap'):
    # print() with a single argument behaves the same on Python 2 and Python 3
    print(link.find('a').attrs['href'])
This code does not seem to work. I am not familiar with Beautiful Soup. How can I extract the links?
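For reference, here is a minimal sketch of one way the extraction could be written, run against an inline copy of the markup above rather than the live page. The closing </div> tags, the html variable, and the extract_post_links helper are assumptions added for illustration; they are not part of the original page or attempt. The sketch selects the anchor directly with a CSS selector, so a wrapper div that has no link is skipped instead of raising an error.

```python
from bs4 import BeautifulSoup

# Inline sample modelled on the markup in the question.
# Assumption: the closing </div> tags are added here so the snippet is well-formed.
html = """
<div class="post-info-wrap">
<h2 class="post-title"><a href="https://www.example.com/blog/111/this-is-1st-post/" rel="bookmark">sample post – example 1 post</a></h2>
<div class="post-meta clearfix"></div>
</div>
<div class="post-info-wrap">
<h2 class="post-title"><a href="https://www.example.com/blog/111/this-is-2nd-post/" rel="bookmark">sample post – example 2 post</a></h2>
<div class="post-meta clearfix"></div>
</div>
"""


def extract_post_links(markup):
    """Return the href of every title link inside a div.post-info-wrap.

    Hypothetical helper for illustration only.
    """
    soup = BeautifulSoup(markup, "html.parser")
    # a[href] matches only anchors that actually carry an href attribute.
    anchors = soup.select("div.post-info-wrap h2.post-title a[href]")
    return [a["href"] for a in anchors]


urls = extract_post_links(html)
for url in urls:
    print(url)
```

If the same selector returns nothing for the real page, it may be worth checking r.status_code and whether the class name actually appears in r.text, since the page could be rendered by JavaScript or served differently to scripts.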
【Comments】:
Tags: python beautifulsoup