【Posted】: 2016-01-02 08:59:57
【Question】:
# Parse PubMed data
library(XML)     # XPath helpers (xpathApply, xmlValue)
library(rentrez) # entrez_fetch
pmids <- c("25506969", "25032371", "24983039", "24983034", "24983032", "24983031", "26386083",
           "26273372", "26066373", "25837167", "25466451", "25013473", "23733758")
# The IDs above are a mix of books and journal articles.
# ID 23733758 is a journal article and has no abstract.
data.pubmed <- entrez_fetch(db = "pubmed", id = pmids, rettype = "xml",
                            parsed = TRUE)
abstracts <- xpathApply(data.pubmed, "//Abstract", xmlValue)
names(abstracts) <- pmids  # fails when fewer abstracts than PMIDs are returned
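This works well if every record has an abstract. However, when a PMID (#23733758) has no published abstract (or the record is a book article or some other type), that record is skipped, which leads to the error: 'names' attribute [5] must be the same length as the vector [4]
Q: How can I pass multiple paths/nodes so that I can extract journal articles, books, or reviews?
Update: hrbrmstr's solution helps with the NA case. However, can xpathApply take multiple nodes, something like c(//Abstract, //ReviewArticle, etc.)?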
【Comments】:
- You can use try() or tryCatch()
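- Hi Richard, I'm not sure I understand your solution. If my input is 5 PMIDs, my goal is to get 5 abstracts as output. If a record has no abstract, it should still return an empty value (4 abstracts and 1 NULL). That way, when I add the PMIDs as names, I will know which PMID has no abstract.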
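
One possible way to get exactly one entry per record, sketched under assumptions rather than offered as the definitive answer: XPath's union operator `|` lets a single expression match several node types, so instead of passing a vector of paths you can select every record node (`//PubmedArticle | //PubmedBookArticle`), query inside each record, and substitute NA when nothing matches. The XML below is a small hypothetical stand-in for what entrez_fetch returns (simplified structure, made-up PMIDs 1 to 3, placeholder text), and extract_abstract is an illustrative helper name, not part of XML or rentrez.

```r
library(XML)

# Hypothetical, simplified stand-in for the parsed entrez_fetch() result.
# PMIDs and abstract text are placeholders, not real PubMed data.
xml_text <- '<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>1</PMID>
      <Article><Abstract><AbstractText>Journal abstract.</AbstractText></Abstract></Article>
    </MedlineCitation>
  </PubmedArticle>
  <PubmedBookArticle>
    <BookDocument>
      <PMID>2</PMID>
      <Abstract><AbstractText>Book abstract.</AbstractText></Abstract>
    </BookDocument>
  </PubmedBookArticle>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>3</PMID>
      <Article><ArticleTitle>No abstract here</ArticleTitle></Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>'
doc <- xmlParse(xml_text, asText = TRUE)

# The XPath union operator "|" selects both record types in document order.
records <- getNodeSet(doc, "//PubmedArticle | //PubmedBookArticle")

# Illustrative helper: return the abstract text of one record, or NA if absent.
extract_abstract <- function(rec) {
  hits <- xpathSApply(rec, ".//Abstract", xmlValue)  # relative to this record
  if (length(hits) == 0) NA_character_ else paste(hits, collapse = " ")
}

# One value per record, so the result always lines up with the PMIDs.
abstracts <- vapply(records, extract_abstract, character(1))

# Name each entry by the first PMID found inside the same record.
names(abstracts) <- vapply(
  records,
  function(rec) xpathSApply(rec, ".//PMID", xmlValue)[1],
  character(1)
)

print(abstracts)
```

With real data, `doc` would be the object returned by entrez_fetch(..., parsed = TRUE). Taking the names from the records themselves, rather than from the input vector, avoids relying on the records coming back in the same order as the requested IDs. More node types can be added to either union in the same way, for example `.//Abstract | .//OtherAbstract`, if those elements occur in your records.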