Scraping a DIV, then grabbing all <a> tags in the siblings of that DIV

【Question title】: Scraping a DIV then grabbing all <a> in the siblings of that DIV
【Posted】: 2018-03-27 11:46:14
【Question description】:

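What I'm trying to do is get the current date and store it in a variable, then use that variable to find the DIV that contains that date. Once that DIV is found, I want the script to grab all the <a href> links in its sub (sibling) DIVs.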

import re, time
from urllib2 import urlopen as uReq
import datetime as dt

from bs4 import BeautifulSoup as soup
my_url = 'www.domainname.com'

uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

date = dt.datetime.today().strftime("%m-%d-%Y")
page_soup = soup(page_html, "lxml") 

### I feel like I'm missing something here!
### Need to add variable (date) to find DIV (i.e. 2017-10-21)
### Add H REF links from all sub DIVs within the variable (date). which I believe would use the code below?

links = page_soup.findAll('div', attrs={'class' : 'gameLinks'})
for div in links:
    link = div.find('a')['href']
    if "ufc" in link:  
        print """<a href="{link}">{link}</a><br>""".format(link=link)

Any ideas?

【Comments on the question】:

Possible duplicate of "Select all div siblings by using BeautifulSoup"

Tags: python web-scraping beautifulsoup

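【Solution 1】:

Renaming every module on import makes your code very hard to read, but that aside, this is the typical pattern for finding links inside a div (or anywhere else):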

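links = []

# Look at every div with class "gameLinks"
for div in page_soup.find_all('div', attrs={'class': 'gameLinks'}):
    # Keep only the divs whose text mentions the date you want
    if 'some_date' in div.text:
        for a in div.find_all('a'):
            if 'ufc' in a['href']:
                links.append(a['href'])
                print(a['href'])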
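
To make the pattern above concrete, here is a minimal, self-contained sketch that plugs a date variable in where 'some_date' appears. The sample HTML, the helper name links_for_date, and the "%Y-%m-%d" format are assumptions for illustration only; the real page may lay out its dates and links differently, so check the page source first.

```python
import datetime as dt
from bs4 import BeautifulSoup

# Hypothetical markup: each gameLinks div carries its date as plain text.
SAMPLE_HTML = """
<div class="gameLinks">2017-10-21 <a href="http://example.com/ufc-1">a</a>
  <a href="http://example.com/nba-1">b</a></div>
<div class="gameLinks">2017-10-22 <a href="http://example.com/ufc-2">c</a></div>
"""

def links_for_date(html, date_text, keyword="ufc"):
    """Return hrefs containing `keyword` from gameLinks divs whose text mentions `date_text`."""
    page_soup = BeautifulSoup(html, "html.parser")
    links = []
    for div in page_soup.find_all("div", attrs={"class": "gameLinks"}):
        if date_text in div.text:
            # href=True skips anchors that have no href attribute
            for a in div.find_all("a", href=True):
                if keyword in a["href"]:
                    links.append(a["href"])
    return links

# Format the date the way the page writes it (assumed year-month-day here).
date = dt.date(2017, 10, 21).strftime("%Y-%m-%d")
found = links_for_date(SAMPLE_HTML, date)
for link in found:
    print('<a href="{link}">{link}</a><br>'.format(link=link))
```

In a live script the fixed date would presumably be replaced by dt.datetime.today(), and the format string must match what the page actually prints; a "%m-%d-%Y" string will never match text such as 2017-10-21.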

【Discussion】:

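I understand, but I'm trying to find a div with specific text, for example <div class="var2">2017-10-21</div>, and then tell the script to grab all the <a href> tags in the sub-divs that belong to 2017-10-21.
The divs should be named in a consistent way; check the page source to find the name. Then just put that identifier into attrs.
If the div you are looking for contains a specific date, you can filter for it with an if statement: if '2017-10-21' in div.text:
Where would that go in the code above? Thanks, Evan
What about my old way of printing the links? So far this isn't working for me. :(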
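
If the date really sits in its own div and the links live in the divs that follow it, one possible approach is to find the date div by its exact text and then walk its following siblings. This is only a sketch under assumptions: the sample HTML, the helper name sibling_links, and the idea that a date div with class "var2" is followed by sibling "gameLinks" divs are all hypothetical and need to be checked against the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical layout: a date div followed by sibling divs holding the links.
SAMPLE_HTML = """
<div id="schedule">
  <div class="var2">2017-10-21</div>
  <div class="gameLinks"><a href="http://example.com/ufc-216">x</a></div>
  <div class="gameLinks"><a href="http://example.com/nfl-7">y</a></div>
  <div class="var2">2017-10-22</div>
  <div class="gameLinks"><a href="http://example.com/ufc-217">z</a></div>
</div>
"""

def sibling_links(html, date_text, date_class="var2"):
    """Collect hrefs from the divs that follow the div whose text is exactly `date_text`."""
    soup = BeautifulSoup(html, "html.parser")
    # Find the div whose only content is the date string.
    date_div = soup.find("div", class_=date_class, string=date_text)
    if date_div is None:
        return []
    hrefs = []
    for sibling in date_div.find_next_siblings("div"):
        if date_class in sibling.get("class", []):
            break  # reached the next date div, so stop collecting
        hrefs.extend(a["href"] for a in sibling.find_all("a", href=True))
    return hrefs

day_links = sibling_links(SAMPLE_HTML, "2017-10-21")
ufc_links = [h for h in day_links if "ufc" in h]
print(ufc_links)
```

Stopping at the next date div keeps links from later dates out of the result. If the link divs are instead nested inside the date div, date_div.find_all('a', href=True) would be the simpler route.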