【Question title】: Rvest: select content from Xpath
【Posted】: 2021-11-28 12:07:55
【Question】:

I have the following HTML source:

<div class="re__section-body re__detail-content js__section-body js__pr-description js__tracking" trackingid="lead-phone-ldp" trackinglabel="loc=Sale-Listing Details-body,prid=30449316">
khu vực dao động từ 45tr - 60tr/m2.<br>Liên hệ tôi chính chủ SĐT: 
<span class="hidden-phone hidden-mobile m-cover js__btn-tracking" tracking-id="lead-phone-ldp" tracking-label="loc=Sale-Listing Details-body,prid=30449316" raw="0935686566">0935686***</span>.
</div>

I tried to extract the raw attribute with this code, but the result is empty:

html_nodes(xpath = "//span[@class = 'hidden-phone hidden-mobile m-cover js__btn-tracking'") %>% 
html_attr("raw") %>% 
html_text()

What should I do? Many thanks to everyone.

【Comments】:

    Tags: r web-scraping xpath rvest


    【Solution 1】:

    Let foo.html be the HTML file from which you want to extract the node attribute:

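    library(rvest)
    library(magrittr)
    
    read_html("foo.html") %>%
      # Match the span by its full class string; note the closing ] of the predicate
      html_nodes(xpath = "//span[@class='hidden-phone hidden-mobile m-cover js__btn-tracking']") %>%
      # html_attr() returns the attribute value as a character vector
      html_attr("raw") %>%
      # as.numeric() drops the leading zero; omit it to keep the phone number as text
      as.numeric()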
    

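    html_attr("raw") already returns a character vector, so the subsequent html_text() call is unnecessary. Note also that the XPath in the question is missing the closing ] of its predicate.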
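
The same idea can be tried without a file on disk. The sketch below is only an illustration: it assumes a trimmed, ASCII-only copy of the markup from the question passed to read_html() as a string, and the variable names (page, raw_exact, raw_css, masked) are made up for this example. It also shows a CSS selector on a single class as a possible alternative to matching the whole class string.

```r
library(rvest)

# Trimmed copy of the markup from the question (assumption: inline string instead of foo.html)
html <- '<div class="re__section-body">
<span class="hidden-phone hidden-mobile m-cover js__btn-tracking" raw="0935686566">0935686***</span>.
</div>'

page <- read_html(html)

# Exact match on the full class string, with the predicate properly closed by ]
raw_exact <- page %>%
  html_nodes(xpath = "//span[@class='hidden-phone hidden-mobile m-cover js__btn-tracking']") %>%
  html_attr("raw")

# Looser alternative: a CSS selector that matches on one of the classes
raw_css <- page %>%
  html_nodes("span.hidden-phone") %>%
  html_attr("raw")

# html_text() returns the visible, masked text of the node, not the attribute
masked <- page %>%
  html_nodes("span.hidden-phone") %>%
  html_text()

print(raw_exact)              # "0935686566"
print(masked)                 # "0935686***"
print(as.numeric(raw_exact))  # 935686566, the leading zero is lost
```

Keeping the value as a character string is probably the safer choice for a phone number, since converting it with as.numeric() discards the leading zero.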

    【Comments】:
