【Posted】: 2018-04-18 22:19:40
【Question】:
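I am using Python 2.7 in Jupyter and trying to retrieve data from this website. Scraping the description or the price with BeautifulSoup and the lxml parser works fine.
However, when I try to scrape the reviewers' comments or locations, I cannot retrieve anything; all I get is an empty list [].
I also tried rendering the page with PyQt4 first, but it still does not work. How can I solve this?
My code is attached below: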
import PyQt4
from PyQt4.QtGui import *
from PyQt4.QtCore import *
from PyQt4.QtWebKit import *
import sys
from lxml import html
from bs4 import BeautifulSoup
import os
import requests

site = 'https://www.bedbathandbeyond.com/store/product/dyson-v7-motorhead-cord-free-stick-vacuum-in-fuchsia-steel/1061083288?brandId=162'

class Render(QWebPage):
    def __init__(self, url):
        self.app = QApplication(sys.argv)
        QWebPage.__init__(self)
        self.loadFinished.connect(self._loadFinished)
        self.mainFrame().load(QUrl(url))
        self.app.exec_()

    def _loadFinished(self, result):
        self.frame = self.mainFrame()
        self.app.quit()

r = Render(site)
result = r.frame.toHtml()
formatted_result = str(result.toAscii())
tree = html.fromstring(formatted_result)
soup = BeautifulSoup(formatted_result, 'lxml')
soup.find_all('span', class_='BVRRValue BVRRUserLocation')  # return value is []
Thank you very much!
【Discussion】:
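One way to narrow this down is to run the same class lookup on a small hand-written snippet and then on the HTML string captured from the renderer. If the lookup matches on the snippet but returns [] on the captured page, the review elements are probably not in that HTML at all, rather than the selector being wrong. One possible cause, which is an assumption here and not something the post confirms, is that the reviews are injected by JavaScript after `loadFinished` fires. Below is a minimal sketch using only the standard library. The names `SpanTextCollector` and `find_span_text` and both sample strings are hypothetical, made up for illustration, and are not taken from the site.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7


class SpanTextCollector(HTMLParser):
    """Collect the text of <span> elements that carry all wanted classes."""

    def __init__(self, wanted_classes):
        HTMLParser.__init__(self)
        self.wanted = set(wanted_classes)
        self.depth = 0      # > 0 while inside a matching <span>
        self.buffer = []    # text pieces of the span being read
        self.matches = []   # finished results

    def handle_starttag(self, tag, attrs):
        if tag != 'span':
            return
        if self.depth:
            # Nested <span> inside a match: track it so the closing
            # tags pair up correctly.
            self.depth += 1
            return
        classes = set((dict(attrs).get('class') or '').split())
        if self.wanted <= classes:  # every wanted class is present
            self.depth = 1
            self.buffer = []

    def handle_endtag(self, tag):
        if tag == 'span' and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.matches.append(''.join(self.buffer).strip())

    def handle_data(self, data):
        if self.depth:
            self.buffer.append(data)


def find_span_text(markup, wanted_classes):
    """Return the text of every <span> whose class list includes wanted_classes."""
    parser = SpanTextCollector(wanted_classes)
    parser.feed(markup)
    parser.close()
    return parser.matches


# Hypothetical samples: one where the review markup is present and one
# where only an empty placeholder was captured.
sample_with_reviews = '<div><span class="BVRRValue BVRRUserLocation">Boston, MA</span></div>'
sample_placeholder_only = '<div id="reviews-placeholder"></div>'

wanted = ['BVRRValue', 'BVRRUserLocation']
print(find_span_text(sample_with_reviews, wanted))
print(find_span_text(sample_placeholder_only, wanted))
```

Passing `formatted_result` from the question to `find_span_text` in place of the samples would show which case applies. A plain substring check such as `'BVRRUserLocation' in formatted_result` is a quicker, cruder version of the same test.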
Tags: python python-2.7 web-scraping beautifulsoup pyqt4