【Question title】: Python: Web Scraping Attribute Error (ResultSet)
【Posted】: 2021-01-05 00:49:35
【Question】:
from bs4 import BeautifulSoup
from urllib.request import urlopen as uReq
import requests

url = 'https://en.wikisource.org/wiki/Main_Page'
r = requests.get(url)

Soup = BeautifulSoup(r.text, "html5lib")
List = Soup.find("div",class_="enws-mainpage-widget-content", id="enws-mainpage-newtexts-content").find_all('a')
ebooks=[]
i=0
for ebook in List:
    x=ebook.get('title')
    for ch in x:
        if(ch==":"):
            x=""
    if x!="":
        ebooks.append(x)
        i=i+1
        

inputnumber=0
while inputnumber<len(ebooks):
    print(inputnumber+1, " - ", ebooks[inputnumber])
    inputnumber=inputnumber+1
input=int(input("Please select a book: "))
selectedbook = Soup.find("a", title=ebooks[input-1])
print(selectedbook['title'])
url1 = "https://en.wikisource.org/"+selectedbook['href']
r1 = requests.get(url1)
Soup1 = BeautifulSoup(r1.text, "html5lib")
List1 = Soup1.find_all("div", class_="prp-pages-output").find_all('p')
words=str(List1)
ebook1= open('ebook1.txt', 'w', encoding="utf-8")
ebook1.write(words)
ebook1.close()

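I am trying to download the e-book that the user selects from this site: 'https://en.wikisource.org/wiki/Main_Page'

Everything works until I try to get the paragraphs from the selected book. I get this error on the List1 line: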

Traceback (most recent call last):
  File "homework.py", line 32, in <module>
    List1 = Soup1.find_all("div", class_="prp-pages-output").find_all('p')
  File "C:\Users\Özdal\AppData\Local\Programs\Python\Python38-32\lib\site-packages\bs4\element.py", line 2173, in __getattr__
    raise AttributeError(
AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?

If I change it to this:

List1 = Soup1.find("div", class_="prp-pages-output").find_all('p')

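the code only gives me the paragraphs from the first div, but I need them from all of the divs. What should I do?

【Comments】:

  • I suggest you take a look at a style guide such as PEP 8.

Tags: python web web-scraping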


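【Solution 1】:

find_all() returns a ResultSet, which is a list of elements, so you have to call find_all('p') on each div in turn. The following gives you a list of lists: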

List1 = [x.find_all('p') for x in Soup1.find_all("div", class_="prp-pages-output")]

Then you can flatten it like this:

flat_list1 = [item for sublist in List1 for item in sublist]
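
As a minimal sketch of the same idea, the snippet below runs the nested search on a small inline HTML string rather than the live Wikisource page. The HTML, the `other` class, and the variable names are made up for illustration; only the `prp-pages-output` class comes from the question. It also shows `select()` with a CSS selector, which is one alternative way to get a flat list in a single call.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a Wikisource page: two content divs plus one unrelated div.
html = """
<div class="prp-pages-output"><p>First</p><p>Second</p></div>
<div class="other"><p>Skipped</p></div>
<div class="prp-pages-output"><p>Third</p></div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns a ResultSet (a list subclass), so iterate over it
# and call find_all('p') on each individual div.
nested = [div.find_all("p") for div in soup.find_all("div", class_="prp-pages-output")]
flat = [p for sublist in nested for p in sublist]

# A CSS selector collects the same paragraphs in one call, already flat.
selected = soup.select("div.prp-pages-output p")

# get_text() yields the paragraph text rather than the tag markup.
texts = [p.get_text() for p in selected]
print(texts)  # ['First', 'Second', 'Third']
```

If the goal is a readable text file, writing something like "\n".join(texts) may be a better fit than str(List1), since the latter writes the list representation including the HTML tags.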

【Comments】:

  • Would you mind upvoting/accepting? Thanks!