【Posted】: 2021-03-12 12:27:12
【Question】:
My data:
xpath <- "/html/body[@class=\"das\"]/div/div[@class=\"ddd\"]/div[@class=\"spasd\"]/div[@class=\"josner(m/doandel\"]/p[@class=\"ficp\"]/a/strong[@class=\"asd\"]"
Question:
How can I determine which tags in my xpath string have a class?
Expected output:
c(FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE)
What I tried:
library(magrittr)  # provides the %>% pipe used below
# Try1: splitting on "/" fails when a class name itself contains "/"
xpath %>% strsplit(split = "/") %>% .[[1]]
# Try2:
xp <- (strsplit(xpath, split = "")[[1]][-1]) %>% paste(collapse = "")
rr <- strsplit(xp, split = "\\[(?:.*?)\\]/")[[1]] %>% stringr::str_count(pattern = "/")
has_class <- lapply(rr, function(r) rep(!r, r + 1)) %>% unlist
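One way around the problem noted in Try1 might be to neutralise the predicates before splitting, so that a "/" inside a class name can no longer be mistaken for a step separator. This is only a minimal base-R sketch. It assumes every predicate has exactly the form [@class="..."] with a double-quoted value that contains no double quote; the names `marked` and `steps` are introduced here for illustration.

```r
xpath <- "/html/body[@class=\"das\"]/div/div[@class=\"ddd\"]/div[@class=\"spasd\"]/div[@class=\"josner(m/doandel\"]/p[@class=\"ficp\"]/a/strong[@class=\"asd\"]"

# Replace each [@class="..."] predicate with a fixed marker that contains no "/"
marked <- gsub("\\[@class=\"[^\"]*\"\\]", "[@class]", xpath)

# Drop the leading "/" and split into one element per step
steps <- strsplit(sub("^/", "", marked), "/", fixed = TRUE)[[1]]

# A step has a class exactly when it carries the marker
has_class <- grepl("[@class]", steps, fixed = TRUE)
has_class
```

Other predicate forms (single quotes, positions such as [1], several attributes) would need a more general pattern or a real XPath parser.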
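Edit:
Now I'm thinking: if I had the corresponding document, I could parse the target tag, walk up the tree, and check each ancestor for a class.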
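The tree-walking idea could look roughly like the sketch below, assuming the xml2 package is installed. The HTML string is a hypothetical minimal document built here only so that the xpath matches something; it is not the real page. The `ancestor-or-self::*` axis collects the target and all of its ancestors.

```r
library(xml2)

# Hypothetical document whose structure mirrors the xpath from the question
html <- paste0(
  '<html><body class="das"><div><div class="ddd"><div class="spasd">',
  '<div class="josner(m/doandel"><p class="ficp"><a>',
  '<strong class="asd">x</strong></a></p></div></div></div></div></body></html>'
)
doc <- read_html(html)

xpath <- "/html/body[@class=\"das\"]/div/div[@class=\"ddd\"]/div[@class=\"spasd\"]/div[@class=\"josner(m/doandel\"]/p[@class=\"ficp\"]/a/strong[@class=\"asd\"]"

# Locate the target node, then collect it together with all its ancestors
target <- xml_find_first(doc, xpath)
nodes <- xml_find_all(target, "ancestor-or-self::*")

# TRUE where the element carries a class attribute; named by tag for readability
has_class <- xml_has_attr(nodes, "class")
names(has_class) <- xml_name(nodes)
has_class
```

Note that this reports what the document contains, not what the xpath string mentions: an element can have a class in the page even if the xpath step for it has no [@class=...] predicate.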
【Comments】: