Python: No tables found matching pattern '.+'

【Question title】: Python: No tables found matching pattern '.+'
【Posted】: 2018-04-12 02:52:57
【Question description】:

I want to export this table to CSV from a Python script, covering all 7 pages at 100 rows per page, but I get the error shown below the script.

“http://www.nhl.com/stats/player?aggregate=1&gameType=2&report=points&pos=S&reportType=game&startDate=2017-10-19&endDate=2017-10-29&filter=gamesPlayed,gte,1&sort=points,goals”

import pandas as pd

dfs = pd.read_html('http://www.nhl.com/stats/player?aggregate=1&gameType=2&report=skatersummary&pos=S&reportType=game&startDate=2017-10-19&endDate=2017-10-29&filter=gamesPlayed,gte,1&sort=points,goals,assists')
df = pd.concat(dfs, ignore_index=True)
df.to_csv("1019_1029.csv", index=False)
print(df)

ValueError: No tables found matching pattern '.+'

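【Comments】:

From this code you should get an "undefined df" error, because you never assigned it before using it. Are you using Jupyter Notebook to edit and run your code? Keep in mind that it stores global state until you do a "kernel restart".
I didn't mean to post a comment. I was trying something and accidentally left it there. I'm just using the Python shell.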

Tags: python pandas csv

【Solution 1】:

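This site is not suitable for pandas.read_html. According to the pandas documentation:

This function searches for <table> elements and only for <tr> and <th> rows and <td> elements within each <tr> or <th> element in the table. <td> stands for "table data".

But the site you are trying to parse uses <div> elements to structure the data into a table, so you will need a custom parsing solution to read data from this site.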
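
The behaviour described in the documentation can be checked on a small scale. The sketch below is only an illustration: TABLE_HTML, DIV_HTML, the class names, and the count_tables helper are all made up for this example and are not taken from the NHL page. flavor="lxml" is pinned only so the sketch depends on a single parser.

```python
from io import StringIO

import pandas as pd

# Hypothetical markup: the same kind of data, once as a real <table> ...
TABLE_HTML = """
<table>
  <tr><th>Player</th><th>Points</th></tr>
  <tr><td>A</td><td>10</td></tr>
  <tr><td>B</td><td>8</td></tr>
</table>
"""

# ... and once laid out with <div> elements only (class names are invented).
DIV_HTML = """
<div class="grid">
  <div class="row"><div class="head">Player</div><div class="head">Points</div></div>
  <div class="row"><div class="cell">A</div><div class="cell">10</div></div>
  <div class="row"><div class="cell">B</div><div class="cell">8</div></div>
</div>
"""


def count_tables(html):
    """Return how many tables read_html finds, or 0 if it raises ValueError."""
    try:
        return len(pd.read_html(StringIO(html), flavor="lxml"))
    except ValueError:
        # read_html raises ValueError when no <table> element is found.
        return 0


print(count_tables(TABLE_HTML))
print(count_tables(DIV_HTML))
```

The <div> version yields no tables at all, which is the situation the error in the question reports.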

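【Discussion】:

Using the class names, you can convert this HTML into <table>, <tr>, <thead>, and so on. You can use an HTML parser library such as BeautifulSoup to do the conversion and then pass the output to pandas.read_html. stackoverflow.com/questions/5289189/…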
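
A minimal sketch of that idea follows, under stated assumptions: the markup in DIV_HTML, the class names in CLASS_TO_TAG, and the divs_to_table helper are hypothetical, and the real page's class names would have to be looked up in its HTML. If the page builds its rows with JavaScript, the downloaded HTML may not contain them, and this approach alone would not be enough.

```python
from io import StringIO

import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical div-based markup; the real page's structure will differ.
DIV_HTML = """
<div class="grid">
  <div class="row"><div class="head">Player</div><div class="head">Points</div></div>
  <div class="row"><div class="cell">A</div><div class="cell">10</div></div>
  <div class="row"><div class="cell">B</div><div class="cell">8</div></div>
</div>
"""

# Assumed mapping from CSS class to the table tag it should become.
CLASS_TO_TAG = {"grid": "table", "row": "tr", "head": "th", "cell": "td"}


def divs_to_table(html):
    """Rename <div> elements to table tags based on their class names."""
    soup = BeautifulSoup(html, "html.parser")
    for div in soup.find_all("div"):
        for cls in div.get("class", []):
            if cls in CLASS_TO_TAG:
                div.name = CLASS_TO_TAG[cls]  # rename the tag in place
                break
    return str(soup)


# After the conversion, read_html sees an ordinary <table>.
df = pd.read_html(StringIO(divs_to_table(DIV_HTML)))[0]
print(df)
```

Renaming the tags keeps the original nesting intact, so pandas.read_html can take over the header and type inference instead of that logic being rewritten by hand.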

What would that approach look like in this case?