Answers to: python beautifulsoup findall within find

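【Question title】: python beautifulsoup findall within find
【Posted】: 2016-03-25 01:37:29
【Question】:

I am trying to get the text inside each td with class 'column-1', but I'm running into trouble: I get an error saying it has no attribute text, even though it clearly does, so I must be doing something wrong. Here is the code: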

import urllib
import urllib.request
from bs4 import BeautifulSoup

theurl="http://vermontamerican.com/products/standard-drill-bit-extensions/"
thepage = urllib.request.urlopen(theurl)
soup = BeautifulSoup(thepage,"html.parser")

for part in soup.find_all('td'),{"class":"column-1"}:
    part1 = part.text
    print(part1)

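If I take out the second line of the loop and just print part instead, I get a result, but it gives me every td rather than only column-1. I have also tried the following, but I'm new to this, so I'm sure it is wrong in more ways than one.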

import urllib
import urllib.request
from bs4 import BeautifulSoup

theurl="http://vermontamerican.com/products/standard-drill-bit-extensions/"
thepage = urllib.request.urlopen(theurl)
soup = BeautifulSoup(thepage,"html.parser")


for part in soup.find('tbody'),{"class":"row-hover"}:
    for part1 in part.find_all('a'):
        print(part1)

【Comments】:

Tags: python html python-3.x web-scraping beautifulsoup

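【Solution 1】:

You are not passing the attribute selection dictionary to the find_all() function; the closing parenthesis comes before the comma, so the dictionary ends up outside the call. Replace: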

for part in soup.find_all('td'),{"class":"column-1"}:

with:

for part in soup.find_all('td', {"class":"column-1"}):

Now your code will produce:

17103
17104
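
To see why the original line misbehaves, here is a minimal sketch that runs offline. The HTML snippet is a hypothetical stand-in for the table on the page (only the class names and the two part numbers come from the question; the second column is invented for illustration). With the comma outside the call, Python builds a two-item tuple, so the loop visits the whole result list first and the dictionary second, and neither of those has a usable .text attribute.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the product table; not the real page markup.
html = """
<table>
  <tbody class="row-hover">
    <tr><td class="column-1">17103</td><td class="column-2">desc A</td></tr>
    <tr><td class="column-1">17104</td><td class="column-2">desc B</td></tr>
  </tbody>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# Original form: the comma is outside find_all(), so this is a 2-tuple of
# (all td elements, the attribute dict). Looping over it never filters anything.
broken = soup.find_all('td'), {"class": "column-1"}
print(type(broken).__name__, len(broken))

# Corrected form: the dict is the second argument of find_all().
column_1 = [td.get_text(strip=True)
            for td in soup.find_all('td', {"class": "column-1"})]
print(column_1)
```

This also explains why printing part in the original loop showed every td: the first item of the tuple is the unfiltered result of find_all('td').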

【Discussion】:
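
The second attempt in the question (find_all inside find) has the same misplaced comma, so its loop also runs over a tuple. Since find() returns a single Tag (or None), it does not need a loop of its own. A minimal sketch of the nested lookup follows, again on a hypothetical snippet: the links and their href values are invented for illustration and may not match the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: links inside the table body plus one link outside it.
html = """
<table>
  <tbody class="row-hover">
    <tr><td class="column-1"><a href="/p/17103">17103</a></td></tr>
    <tr><td class="column-1"><a href="/p/17104">17104</a></td></tr>
  </tbody>
</table>
<a href="/elsewhere">outside the table</a>
"""
soup = BeautifulSoup(html, "html.parser")

# find() gives one Tag or None; pass the attribute dict inside the call.
tbody = soup.find('tbody', {"class": "row-hover"})

links = []
if tbody is not None:
    # find_all() on the Tag searches only that element's descendants.
    links = [a.get_text(strip=True) for a in tbody.find_all('a')]
print(links)
```

Checking for None before calling find_all() avoids an AttributeError when the page has no matching tbody.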