Webscraping specific fields in Python

【Question title】: Webscraping specific fields in Python
【Posted】: 2021-07-12 15:23:20
【Question】:

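How can I extract the companies and their descriptions from here?

From my question yesterday I figured out how to extract the names, but when I apply the same logic to extract their descriptions, it backfires.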

request = requests.get("https://www.clstack.com", verify=False, headers=headers)
soup = bs4.BeautifulSoup(request.content, 'html.parser')
data = soup.find_all('td', {'class':'company'})

for i in data:
    print(i.find['tr'])

Output:

company|description

The description is inside a 'td' tag, but when I call it from my code I don't get any output.

【Comments】:

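There is no class associated with the description tag, which confuses me even more.
Please edit the question to include the full traceback of the error.
The error is that there is no output, lol.
@Byte Obviously there won't be any output. A td tag does not contain any tr tags; the td is inside the tr.
So how do I access both the description and the company name? This is my first time working with HTML, so tutorials haven't really helped and I'm confused.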
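
As the comment above notes, calling .find('tr') on a td searches inside the cell, where no row exists. A minimal sketch of that relationship, run against a made-up HTML snippet (the markup and the names "Example Co", inner_row, and parent_row are assumptions for illustration, not taken from the real page):

```python
import bs4

# Hypothetical markup, assumed to resemble one row of the table on the page.
html = """
<table>
  <tr>
    <td class="company"><img alt="Example Co"></td>
    <td>Example Co provides managed IT services.</td>
  </tr>
</table>
"""

soup = bs4.BeautifulSoup(html, "html.parser")
cell = soup.find("td", {"class": "company"})

# .find() only looks at descendants of the cell, so no <tr> is found.
inner_row = cell.find("tr")

# .find_parent() walks upward and returns the row that encloses the cell.
parent_row = cell.find_parent("tr")
cells_in_row = parent_row.find_all("td")

print(inner_row)
print(len(cells_in_row))
```

Going up to the enclosing row and then listing its cells is one way to reach the description cell from the company cell.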

Tags: python python-3.x selenium web-scraping

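【Solution 1】:

You'll notice that each <td class="company"> tag is followed by another <td> tag that holds the description. So, as you iterate over the <td class="company"> elements, just use .find_next('td') to get the next tag with the description: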

import requests
import bs4

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.76 Safari/537.36'} # This is chrome, you can set whatever browser you like
request = requests.get("https://www.cloudtango.org", verify=False, headers=headers)
soup = bs4.BeautifulSoup(request.content, 'html.parser')
data = soup.find_all('td', {'class':'company'})

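for each in data:
    # The company name is stored in the alt attribute of the logo image
    company = each.find('img')['alt']
    # The description sits in the next <td> after the company cell
    description = each.find_next('td').text
    print(f'{company}: {description}\n\n')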

Output:

Redcentric: Redcentric is a leading UK IT managed services provider that offers a range of IT and Cloud services designed to support organisations in their journey from traditional infrastructure to the Cloud …


Modern Networks: Established in 1999, Modern Networks is a leading provider of IT support, network services, business broadband and telecoms to the UK’s commercial property sector. Additionally, we work with around …


BlackPoint IT Services: BlackPoint’s comprehensive range of Managed IT Services is designed to help you improve IT quality, efficiency and reliability -and save you up to 50% on IT cost. Providing IT solutions for more …


AffinityMSP: AffinityMSP was created with one goal in mind: to help Australian businesses achieve success through high-performance technology. Our consultants take the time to get to know your business and …


centrexIT: Founded in 2002, centrexIT is San Diego's leader in IT management. Our locally-based technology professionals provide outsourced IT service, support, security and leadership for small and medium-…


Carbon60: Carbon60 specializes in delivering secure managed cloud solutions for public and private sector organizations with business-critical workloads. Businesses are at different stages in their cloud …


...
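
The same find_next approach can be tried offline before hitting the live site. The sketch below is only illustrative: the HTML snippet, the company names, and the helper extract_companies are hypothetical, and the real page's markup may differ. It also skips rows where the image or the following cell is missing, rather than raising an error.

```python
import bs4

# Hypothetical markup, assumed to mirror the company/description layout.
html = """
<table>
  <tr>
    <td class="company"><img alt="Example Co"></td>
    <td> Example Co provides managed IT services. </td>
  </tr>
  <tr>
    <td class="company"><img alt="Sample Ltd"></td>
    <td> Sample Ltd offers cloud hosting. </td>
  </tr>
</table>
"""


def extract_companies(markup):
    """Return (company, description) pairs from the given HTML string."""
    soup = bs4.BeautifulSoup(markup, "html.parser")
    results = []
    for cell in soup.find_all("td", {"class": "company"}):
        img = cell.find("img")
        # find_next returns the first matching tag after this cell
        # in document order, which here is the description cell.
        desc = cell.find_next("td")
        if img is None or desc is None:
            continue  # skip rows that do not match the assumed layout
        results.append((img.get("alt", ""), desc.get_text(strip=True)))
    return results


pairs = extract_companies(html)
for company, description in pairs:
    print(f"{company}: {description}")
```

Using get_text(strip=True) trims the surrounding whitespace that .text would keep.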

【Comments】: