Web scraping the table from the website: answers

【Question title】: Web scraping the table from the website
【Posted】: 2017-10-16 16:05:52
【Question】:

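Hello, I am trying to fetch and parse all of the table data from https://html5test.com/, so I wrote the code below. However, it does not return any data. I have looked through the answers to similar questions but could not work out what is wrong.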

from BeautifulSoup import BeautifulSoup
from urllib2 import urlopen
import re


url='https://html5test.com/'
data=urlopen(url)

parse=BeautifulSoup(data).findAll('div', attrs={'class': 'resultsTable detailsTable'})

【Comments】:

The table data comes from a POST request.
Do you have code showing how to parse the complete table data? I am new to web scraping with Python and BeautifulSoup.

Tags: python web-scraping beautifulsoup

【Answer 1】:

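Looking at the page source (view source in Chrome: https://html5test.com/), I cannot actually find the "resultsTable" class. It appears to be generated dynamically with JavaScript, so the HTML that urlopen receives never contains it. You need a scraper that renders JavaScript, for example Scrapy with Splash (cf. https://blog.scrapinghub.com/2015/03/02/handling-javascript-in-scrapy-with-splash/).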
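One way to run the same check from a script, rather than by eye, is to search the HTML exactly as the server sends it. The following is only a minimal sketch using the Python 3 standard library; `ClassFinder`, `find_in_static_html`, `fetch_static_html`, and the two sample pages are hypothetical names and data made up for illustration, not part of the original question.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class ClassFinder(HTMLParser):
    """Records the tag name of every element that carries all wanted classes."""

    def __init__(self, wanted_classes):
        super().__init__()
        self.wanted = set(wanted_classes.split())
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # A valueless class attribute is reported as None, so fall back to "".
        classes = set((dict(attrs).get("class") or "").split())
        if self.wanted and self.wanted <= classes:
            self.matches.append(tag)


def find_in_static_html(html, wanted_classes):
    """Return the tags in `html` whose class attribute includes every wanted class."""
    finder = ClassFinder(wanted_classes)
    finder.feed(html)
    return finder.matches


def fetch_static_html(url):
    """Download the HTML as served, before any JavaScript runs (what "view source" shows)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical sample pages: the first mimics a server response whose table is
# filled in later by a script, the second mimics the DOM after rendering.
static_page = '<html><body><div id="results"></div><script src="app.js"></script></body></html>'
rendered_page = (
    '<html><body><div id="results">'
    '<div class="resultsTable detailsTable"><table></table></div>'
    '</div></body></html>'
)

print(find_in_static_html(static_page, "resultsTable detailsTable"))    # []
print(find_in_static_html(rendered_page, "resultsTable detailsTable"))  # ['div']
```

To try it against the real site, one could call `find_in_static_html(fetch_static_html('https://html5test.com/'), 'resultsTable detailsTable')`. An empty list would suggest that the element is inserted by JavaScript after the page loads, in which case a plain HTTP fetch followed by BeautifulSoup will not see it.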

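【Comments】:

With the Chrome developer tools I can see the "resultsTable detailsTable" class.
@jisan No, do not rely on the developer tools, because they show the data as it is after the page has loaded. Always verify against view source.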