Scraping a <div> inside a <div>: answers

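【Question title】: Scraping <div> inside a <div>
【Posted】: 2021-03-18 07:55:25
【Question】:

I'm having some trouble scraping names that sit in a <div> nested inside another <div>. Even when I search for the specific card body, I get a completely different part of the page.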

https://namemc.com/minecraft-names?sort=asc&length_op=&length=3&lang=&searches=500

I need this part:

<div class="card-body p-0">
   <div class="row no-gutters py-1 px-3">
      <div class="col col-lg order-lg-1 text-nowrap text-ellipsis">
         <a href="/name/example" translate="no">example</a>

Even when I do find the names, they are not in the list I want. Does anyone know how to find them? I'm using beautifulsoup and lxml. Part of my code:

from bs4 import BeautifulSoup
import requests
html_text = requests.get('https://namemc.com/minecraft-names?sort=asc&length_op=&length=3&lang=&searches=500').text
soup = BeautifulSoup(html_text, 'lxml')
itemlocator = soup.find('div', class_='card-body p-0')
for items in itemlocator:
    print(items)

【Comments】:

stackoverflow.com/help/someone-answers

Tags: python html python-3.x web-scraping beautifulsoup

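【Solution 1】:

The script below should produce the available names you see on that page. However, it looks like you are only after the container in which Commander appears. In that case, you can try something like the following to get the required portion; it is more concise and efficient than your current attempt.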

import requests
from bs4 import BeautifulSoup

link = 'https://namemc.com/minecraft-names?sort=asc&length_op=&length=3&lang=&searches=500'

# Send a browser-like User-Agent header with the request
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36'
}

response = requests.get(link, headers=headers)
soup = BeautifulSoup(response.text, 'lxml')

# Pick the link whose href starts with /name/Commander inside a card-body row
item = soup.select_one(".card-body > .no-gutters a[href^='/name/Commander']")
item_text = item.get_text(strip=True)

# Climb two levels to the row container, then read its <time> element
timestamp = item.find_parent().find_parent().select_one("time").get("datetime")
print(item_text, timestamp)

Output:

Commander 2021-03-19T13:10:40.000Z
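
If you want every name in a list rather than a single one, the same selector idea can be applied per row. This is only a rough sketch: it runs against a small hypothetical HTML sample modeled on the markup quoted in the question, the position of the <time> element is an assumption taken from the script above, the second timestamp is made up, and `collect_names` is a name invented for illustration. It uses `html.parser` so it runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical sample modeled on the markup quoted in the question.
# The <time> placement is assumed and the first timestamp is made up.
SAMPLE_HTML = """
<div class="card-body p-0">
  <div class="row no-gutters py-1 px-3">
    <div class="col col-lg order-lg-1 text-nowrap text-ellipsis">
      <a href="/name/example" translate="no">example</a>
    </div>
    <div class="col-auto"><time datetime="2021-03-18T09:00:00.000Z">soon</time></div>
  </div>
  <div class="row no-gutters py-1 px-3">
    <div class="col col-lg order-lg-1 text-nowrap text-ellipsis">
      <a href="/name/Commander" translate="no">Commander</a>
    </div>
    <div class="col-auto"><time datetime="2021-03-19T13:10:40.000Z">later</time></div>
  </div>
</div>
"""


def collect_names(html):
    """Return a list of (name, datetime) tuples, one per row that has a name link."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    # Each direct child row of the card body holds one name
    for row in soup.select(".card-body > .no-gutters"):
        link = row.select_one("a[href^='/name/']")
        if link is None:
            continue  # skip rows without a name link
        time_tag = row.select_one("time")
        when = time_tag.get("datetime") if time_tag else None
        results.append((link.get_text(strip=True), when))
    return results


names = collect_names(SAMPLE_HTML)
print(names)
```

Against the live page you would pass `response.text` from the script above instead of `SAMPLE_HTML`, assuming the page still uses this structure.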

【Comments】:

You still don't get names like Commander.
OK, look for it in the output I pasted above.
Is there any way to fix this?
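
One possible cause, offered as a guess rather than a diagnosis: if the requested name is not on the page at the time of the request, `select_one` returns `None` and the following `.get_text()` call raises an `AttributeError`. A minimal sketch of a guarded lookup is below. `find_name` and the sample HTML are hypothetical, the <time> placement is assumed from the answer's script, and the name is assumed to contain no quote characters.

```python
from bs4 import BeautifulSoup

# Hypothetical one-row sample modeled on the markup quoted in the question
SAMPLE_HTML = """
<div class="card-body p-0">
  <div class="row no-gutters py-1 px-3">
    <div class="col col-lg order-lg-1 text-nowrap text-ellipsis">
      <a href="/name/example" translate="no">example</a>
    </div>
    <div class="col-auto"><time datetime="2021-03-18T09:00:00.000Z">soon</time></div>
  </div>
</div>
"""


def find_name(html, name):
    """Return (name, datetime) for an exact name match, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    link = soup.select_one(f".card-body > .no-gutters a[href='/name/{name}']")
    if link is None:
        return None  # name not present in this HTML
    # Walk up to the enclosing row, then read its <time> element if there is one
    row = link.find_parent("div", class_="no-gutters")
    time_tag = row.select_one("time") if row else None
    when = time_tag.get("datetime") if time_tag else None
    return link.get_text(strip=True), when


found = find_name(SAMPLE_HTML, "example")
missing = find_name(SAMPLE_HTML, "Commander")
print(found, missing)
```

Checking for `None` before using the result lets the script report a missing name instead of crashing.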