Read HTML code into R for data & text mining: answers

【Question title】: Read HTML code into R for data & text mining
【Posted】: 2015-05-03 09:22:39
【Question】:

I am trying to read the information on this web page into R for data and text analysis:

http://www.nhl.com/scores/htmlreports/20142015/PL020916.HTM

I tried to read the page source into R with the following package and code:

library(XML)
theurl <- "http://www.nhl.com/scores/htmlreports/20142015/PL020916.HTM"
# Attempt 1: parse every <table> on the page into a list of data frames
tables <- readHTMLTable(theurl)

# Attempt 2: read the raw HTML source line by line
con <- url(theurl)
htmlCode <- readLines(con)
close(con)
htmlCode

The output I am looking for is a flat file containing the information shown on that page.

【Comments】:

I don't use R, but I came across something about this today that may be what you are looking for: github.com/hadley/rvest

Tags: html r data-mining text-mining

【Solution 1】:

I'm not sure what information you want from the page you linked, but you can read it with rvest like this...

library("rvest")
url <- "http://www.nhl.com/scores/htmlreports/20142015/PL020916.HTM"
# read_html() replaces html(), which was deprecated in later rvest releases
url %>% read_html()
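
Once the page is parsed, one way to get to a flat file is to pull its tables into data frames and write one out as CSV. The following is only a minimal sketch under some assumptions: it parses a small made-up HTML snippet (`sample_html`, a hypothetical stand-in so the code runs without network access) instead of the actual NHL page, and the column names, the `plays` variable and the output file name are illustrative. The real report may need different selectors or extra cleaning.

```r
library(rvest)

# Hypothetical stand-in for the downloaded page: a tiny inline table,
# so the sketch runs without fetching anything.
sample_html <- '<html><body><table>
  <tr><th>Event</th><th>Period</th><th>Description</th></tr>
  <tr><td>FAC</td><td>1</td><td>Faceoff at center ice</td></tr>
  <tr><td>SHOT</td><td>1</td><td>Wrist shot, 35 ft.</td></tr>
</table></body></html>'

# read_html() accepts a URL, a file path, or a literal HTML string.
page <- read_html(sample_html)

# html_table() converts each <table> node into a data frame.
tables <- page %>% html_nodes("table") %>% html_table()
plays <- tables[[1]]

# Write the first table out as a flat CSV file (illustrative file name).
out_file <- file.path(tempdir(), "plays.csv")
write.csv(plays, out_file, row.names = FALSE)
```

For the real page you would pass the URL to `read_html()` instead of `sample_html`, then inspect `tables` to see which element holds the data you want before writing it out.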

【Comments】: