Beautifulsoup AttributeError: 'list' object has no attribute 'text' (answers)

【Question title】: Beautifulsoup AttributeError: 'list' object has no attribute 'text'
【Posted】: 2018-10-09 12:28:10
【Question】:

I have the following HTML:

<div>
    <span class="test">
     <span class="f1">
      5 times
     </span>
    </span>

    </span>
   </div>

<div>

</div>

<div>
    <span class="test">
     <span class="f1">
      6 times
     </span>
    </span>

    </span>
   </div>

I managed to navigate the tree, but when I try to print I get the following error:

AttributeError: 'list' object has no attribute 'text'

The Python code that works:

x=soup.select('.f1')
print(x)

This gives the following:

[]
[]
[]
[]
[<span class="f1"> 19 times</span>]
[<span class="f1"> 12 times</span>]
[<span class="f1"> 6 times</span>]
[]
[]
[]
[<span class="f1"> 6 times</span>]
[<span class="f1"> 1 time</span>]
[<span class="f1"> 11 times</span>]

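But print(x.prettify) throws the error above. I am basically trying to get the text between the span tags for every instance: blank when there is none, and the string when one is available.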

【Comments】:

Shouldn't that throw: AttributeError: 'list' object has no attribute 'prettify'?

Tags: python beautifulsoup

【Solution 1】:

select() returns a list of results, even when there are 0 matches. A list object has no text attribute, so accessing it raises an AttributeError.

Likewise, prettify() is a method for making HTML more readable; it is not something you can call on a list.
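
To make that concrete, here is a minimal sketch, assuming bs4 is installed and using the built-in html.parser. The markup and variable names are illustrative (modelled on the question's snippet, not taken from the real page), and the exact wording of the error message can differ between Beautiful Soup versions.

```python
from bs4 import BeautifulSoup

# Illustrative markup (an assumption, modelled on the question's snippet).
html = '<div><span class="test"><span class="f1"> 5 times </span></span></div>'
soup = BeautifulSoup(html, "html.parser")

matches = soup.select(".f1")          # always a list-like result, even with 0 or 1 hits
print(isinstance(matches, list))      # True

try:
    matches.text                      # attribute lookup on the list itself fails
except AttributeError as exc:
    print("AttributeError:", exc)

first = matches[0]                    # an individual Tag does have .text and .prettify()
print(repr(first.text))
print(first.prettify())

missing = soup.select(".no-such-class")
print(missing)                        # []
```

The attribute has to be read from each element of the list, not from the list as a whole.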

If you only want to extract the texts when they are available:

texts = [''.join(i.stripped_strings) for i in x if i]

# ['5 times', '6 times']

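This strips all the extra spaces/newlines from the strings and gives you just the bare text. The trailing if i is meant to return the text only when i is not None (the Tag objects returned by select() are always truthy, so for this list the guard is only a safeguard).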

If you really care about the spaces/newlines, do this instead:

texts = [i.text for i in x if i]

# ['\n      5 times\n     ', '\n      6 times\n     ']
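
The question also asks for a blank where no span exists, while the list comprehensions above only cover spans that were actually found. One possible way to keep a placeholder per container is to iterate over the enclosing div elements and use select_one(), which returns None when nothing matches. This is only a sketch, assuming each div is the unit of interest; the markup is simplified from the question.

```python
from bs4 import BeautifulSoup

# Simplified version of the question's markup (the stray </span> is omitted).
html = """
<div><span class="test"><span class="f1">
      5 times
</span></span></div>
<div></div>
<div><span class="test"><span class="f1">
      6 times
</span></span></div>
"""
soup = BeautifulSoup(html, "html.parser")

texts = []
for div in soup.find_all("div"):
    span = div.select_one(".f1")      # a Tag, or None if this div has no match
    texts.append(span.get_text(strip=True) if span is not None else "")

print(texts)  # ['5 times', '', '6 times']
```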

【Comments】:

【Solution 2】:

from bs4 import BeautifulSoup
html = '''<div>
    <span class="test">
     <span class="f1">
      5 times
     </span>
    </span>
    </span>
   </div>
<div>
</div>
<div>
    <span class="test">
     <span class="f1">
      6 times
     </span>
    </span>
    </span>
   </div>'''


soup = BeautifulSoup(html, 'html.parser')
aaa = soup.find_all('span', attrs={'class':'f1'})
for i in aaa:
    # strip() drops the surrounding newlines/indentation so the output matches the lines below
    print(i.text.strip())

Output:

5 times
6 times

【Comments】:

【Solution 3】:

I suggest using the .findAll method and looping over the matching spans.

Example:

from bs4 import BeautifulSoup

# html is the markup string from the question; the 'lxml' parser must be installed separately
soup = BeautifulSoup(html, 'lxml')

for span in soup.findAll("span", class_="f1"):
    if span.text.isspace():
        continue
    else:
        print(span.text)

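The .isspace() method checks whether the string consists only of whitespace (testing the string's truthiness does not work here, because a span that looks empty still contains whitespace).

【Comments】: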
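
One caveat with the .isspace() check: it returns False for a zero-length string, so a span with no content at all would still be printed as an empty line. A possible alternative is to strip the text first and then test its truthiness. The markup below is made up for illustration, and html.parser is used so that the sketch does not require lxml.

```python
from bs4 import BeautifulSoup

# Made-up markup: one span with text, one with only whitespace, one truly empty.
html = (
    '<span class="f1"> 5 times </span>'
    '<span class="f1">   </span>'
    '<span class="f1"></span>'
)
soup = BeautifulSoup(html, "html.parser")

print("".isspace(), "   ".isspace())   # False True

kept = []
for span in soup.find_all("span", class_="f1"):
    text = span.get_text(strip=True)   # '' for both whitespace-only and empty spans
    if text:
        kept.append(text)

print(kept)  # ['5 times']
```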