【Question title】: How do I fix the following error message in R: "Error in data.frame arguments imply differing number of rows"?
【Posted】: 2021-03-10 21:43:15
【Question】:

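I am very new to coding in R, and I have been assigned a web-scraping task. When I try to create the data frame (thanks, YouTube!), I keep getting the error message in the title above. Do you see any obvious mistakes in the code below, or do you have any suggestions for fixing the problem? Thanks!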

library(rvest)

link = "http://www.hockeycentral.co.uk/nhl/records/alltimegoal.php"
page = read_html(link)
rank = page %>% html_nodes('#example :nth-child(1)') %>% html_text()
player = page %>% html_nodes('.text-left:nth-child(2)') %>% html_text()
teams = page %>% html_nodes('.text-left:nth-child(3)') %>% html_text()
goals = page %>% html_nodes(':nth-child(4)') %>% html_text()
games = page %>% html_nodes(':nth-child(5)') %>% html_text()
assists = page %>% html_nodes(':nth-child(6)') %>% html_text()
points = page %>% html_nodes(':nth-child(7)') %>% html_text()
PPG = page %>% html_nodes(':nth-child(8)') %>% html_text()
SHG = page %>% html_nodes(':nth-child(9)') %>% html_text()

nhlcareer = data.frame(rank, player, teams, goals, games, assists, points, PPG, SHG,
                       stringsAsFactors = FALSE)

【Comments on the question】:

    Tags: r dataframe web-scraping


    【Answer 1】:

    You can use html_table() to read the whole table in one step:

    library(rvest)
    
    link = "http://www.hockeycentral.co.uk/nhl/records/alltimegoal.php"
    page = read_html(link)
    table <- page %>% html_table() 
    table <- table[[1]]
    head(table)
    
      Rank        Player                                 Team(s) Goals Games Assists Points PPG SHG
    1    1 Wayne Gretzky                      EDM, LAK, STL, NYR   894  1487    1963   2857 204  73
    2    2   Gordie Howe                               DET, HFD.   801  1767    1049   1850 211  24
    3    3  Jaromir Jagr Pit, WSH, NYR, PHI, DAL, BOS, NJD, FLA.   766  1733    1155   1921 217  11
    4    4    Brett Hull                 CGY, STL, DAL, DET, PHX   741  1269     650   1391 265  20
    5    5 Marcel Dionne                           DET, LAK, NYR   731  1348    1040   1771 234  19
    6    6 Phil Esposito                           CHI, BOS, NYR   717  1282     873   1590 246  23
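
    To try the same approach without hitting the live site, here is a minimal sketch on a made-up inline table. The markup and the id "example" are assumptions (the id is only inferred from the '#example' selector in the question), so the real page may differ.

```r
library(rvest)

# Made-up stand-in for the real page; the id "example" is an assumption
# taken from the selector used in the question.
html <- '<table id="example">
  <thead><tr><th>Rank</th><th>Player</th><th>Goals</th></tr></thead>
  <tbody>
    <tr><td>1</td><td>Wayne Gretzky</td><td>894</td></tr>
    <tr><td>2</td><td>Gordie Howe</td><td>801</td></tr>
  </tbody>
</table>'

# read_html() also accepts a literal HTML string.
page <- read_html(html)

# Select the one table by id, then parse it into a data frame.
goals_table <- page %>% html_node("#example") %>% html_table()

# Every HTML row becomes one data-frame row, so all columns
# have the same length by construction.
dim(goals_table)
names(goals_table)
```

    Selecting the table by id instead of taking the first element of the list is a design choice: it keeps working if the page later gains another table before this one.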
    

    The column-by-column extraction you did returns vectors of different lengths, for example:

    > length(player)
    [1] 201
    > length(rank)
    [1] 204
    

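    data.frame() needs every column to have the same number of rows, so R cannot line up vectors of length 201 and 204 and raises the error. The extra matches most likely come from the unscoped selectors: something like ':nth-child(4)' matches every element on the page that is the fourth child of its parent, not only the table cells.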
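
    The failure can be reproduced and diagnosed without any scraping. The sketch below uses made-up toy vectors in place of the scraped columns; lengths() shows at a glance which column is off before data.frame() is called.

```r
# Toy vectors standing in for the scraped columns (values are made up).
rank   <- c("Rank", "1", "2", "3")  # one stray header cell was matched
player <- c("Wayne Gretzky", "Gordie Howe", "Jaromir Jagr")

# Check every column's length in one call before building the data frame.
col_lengths <- lengths(list(rank = rank, player = player))
print(col_lengths)

# data.frame() refuses columns whose lengths do not match;
# capture the error message instead of stopping the script.
msg <- tryCatch(
  {
    data.frame(rank, player, stringsAsFactors = FALSE)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

    If the lengths disagree, the usual fix is to tighten the selector that over-matches, or to fall back to html_table() as shown above.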

    【Comments】:
