【Question title】: Unable to scrape data from this site (Using R)
【Posted】: 2021-03-17 19:47:39
【Question】:

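I can't work out the right CSS selector to use with RSelenium so that it returns any data. The URL is: https://www.rbcroyalbank.com/investments/gic-rates.html

The data I need are the Non-Redeemable GIC rates with interest paid annually (second column), for terms of 1, 2, 3, 4, 5, 7 and 10 years.

Some failed attempts: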

library("RSelenium")
library("rvest")
library("httr")
library("tidyverse")

remDr$navigate("https://www.rbcroyalbank.com/investments/gic-rates.html")
webElem <- remDr$findElement(using = "css selector", value = "tr:nth-child(7) .text-center:nth-child(2) div")


# OR

pg <- remDr$getPageSource()[[1]]
df <- tibble(Rates = pg %>% 
               read_html() %>% 
               html_nodes(xpath = '//tr[(((count(preceding-sibling::*) + 1) = 6) and parent::*)]//*[contains(concat( " ", @class, " " ), concat( " ", "text-center", " " )) and (((count(preceding-sibling::*) + 1) = 2) and parent::*)]//div') %>% 
               html_text())

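【Comments】:

    Tags: r web-scraping rselenium


    【Solution 1】:

    A possible solution is below. The rates table sits inside a collapsed section, so the code first clicks the "Non-Redeemable GIC" toggle (#gic-nrg) and only then reads the table.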

    # Library used to scrape the information; version 1.7.7 (mandatory)
    library(RSelenium)
    driver <- rsDriver(browser = "firefox", port = 4567L)

    # The client object sends commands to the browser.
    remote_driver <- driver[["client"]]
    remote_driver$navigate("https://www.rbcroyalbank.com/investments/gic-rates.html")

    # Expand the collapsed "Non-Redeemable GIC" section.
    remote_driver$findElement(using = "css selector", value = "#gic-nrg")$clickElement()

    # Grab the rates table and put each space-separated token on its own line.
    x <- remote_driver$findElement(using = "css selector", value = "#guaranteed-return-1 > div:nth-child(1) > table:nth-child(1)")
    df <- read.table(text = gsub(" ", "\n", x$getElementText()[[1]]), header = TRUE)

    # Drop the first 46 rows; this offset depends on the page layout when the answer was written.
    df[-(1:46), ]
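Splitting the element text on spaces and dropping a fixed number of rows is fragile, so here is a possible alternative sketch: hand the page source to rvest and let `html_table()` rebuild the table. This is not part of the code above. `extract_rates_table` is a hypothetical helper name, the shorter selector `#guaranteed-return-1 table` is an assumption about the page structure, and the sample markup and rate values are made up purely so the snippet runs without a browser.

```r
library(rvest)

# Hypothetical helper (not from the original answer): parse a page-source
# string and return the first table matched by `selector` as a data frame.
extract_rates_table <- function(page_source,
                                selector = "#guaranteed-return-1 table") {
  node <- html_element(read_html(page_source), selector)
  if (inherits(node, "xml_missing")) {
    stop("No table matched selector: ", selector)
  }
  html_table(node)
}

# Made-up stand-in markup; the real page has more rows and columns,
# and these rate values are invented for illustration only.
sample_html <- paste0(
  '<html><body><div id="guaranteed-return-1"><div><table>',
  '<tr><th>Term</th><th>Annual</th></tr>',
  '<tr><td>1 Year</td><td>0.25%</td></tr>',
  '<tr><td>2 Year</td><td>0.40%</td></tr>',
  '</table></div></div></body></html>'
)

rates <- extract_rates_table(sample_html)
print(rates)
```

In a live session, after the click on `#gic-nrg`, the same helper could be called as `extract_rates_table(remote_driver$getPageSource()[[1]])`, assuming the table is present in the page source at that point.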
    

    【Discussion】:

    • Hi. How did you determine these values: "#gic-nrg" and "#guaranteed-return-1 > div:nth-child(1) > table:nth-child(1)"?
    • Hi. For #gic-nrg, I inspected the page and used "Copy selector" on this line: <a href="#guaranteed-return-1" data-toggle="collapse" class="collapse-toggle a-nonredeemable-gic collapsed" id="gic-nrg" ga-on="click" ga-event-category="Investments - GIC Rates" ga-event-action="Click" ga-event-label="GIC Rates - Non-Redeemable GIC" aria-expanded="false">Non-Redeemable GIC</a>. For the other selector, I did the same on a different line.