【Question title】: Access elements from R's Rd file?
【Posted】: 2013-07-28 14:13:18
【Question description】:

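I would like to go through a package and find out who the authors mentioned in the help file of each function are.

I looked for a function that extracts elements from R's help files, and could not find one. The closest I could find is this post, from Noam Ross.

Does such a function exist? (If not, I guess I'll hack Noam's code to parse the Rd file and locate the specific element I am interested in.)

Thanks, Tal.

Potential code example: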

get_field_from_r_help(topic="lm", field = "Description") #
# output:

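'lm' is used to fit linear models. It can be used to carry out regression, single stratum analysis of variance and analysis of covariance (although 'aov' may provide a more convenient interface for these).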

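【Question discussion】:

  • Example input and output?
  • Joshua - it's not a duplicate, since that question only covers the step of extracting the whole text, not how to parse it. Spacedman - in a minute.
  • @TalGalili: You don't need to parse it; you just need to extract the part you want. Do that by using grep for the section heading you want, then take all the text up to the next section. It may be easier to use the HTML version of the help, as described here
  • Thanks Joshua and Hadley, it looks like there is enough information here for me to play with.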

Tags: r parsing rd


【Solution 1】:

This document and this SO post were helpful for parsing Rd files.

Based on those, you could try something like the following:

getauthors <- function(package){
    db <- tools::Rd_db(package)
    authors <- lapply(db,function(x) {
        tags <- tools:::RdTags(x)
        if("\\author" %in% tags){
            # return a crazy list of results
            #out <- x[which(tags=="\\author")]
            # return something a little cleaner
            out <- paste(unlist(x[which(tags=="\\author")]),collapse="")
        }
        else
            out <- NULL
        invisible(out)
        })
    gsub("\n","",unlist(authors)) # further cleanup
}

Then we can run it on a package or two:

> getauthors("knitr")
                                                                                     d:/RCompile/CRANpkg/local/3.0/knitr/man/eclipse_theme.Rd 
                                                                                                                     "  Ramnath Vaidyanathan" 
                                                                                         d:/RCompile/CRANpkg/local/3.0/knitr/man/image_uri.Rd 
                                                                                                                    "  Wush Wu and Yihui Xie" 
                                                                                      d:/RCompile/CRANpkg/local/3.0/knitr/man/imgur_upload.Rd 
                                                                              "  Yihui Xie, adapted from the imguR package by Aaron  Statham" 
                                                                                          d:/RCompile/CRANpkg/local/3.0/knitr/man/knit2pdf.Rd 
                                                                                         "  Ramnath Vaidyanathan, Alex Zvoleff and Yihui Xie" 
                                                                                           d:/RCompile/CRANpkg/local/3.0/knitr/man/knit2wp.Rd 
                                                                                                          "  William K. Morris and Yihui Xie" 
                                                                                        d:/RCompile/CRANpkg/local/3.0/knitr/man/knit_theme.Rd 
                                                                                                       "  Ramnath Vaidyanathan and Yihui Xie" 
                                                                                     d:/RCompile/CRANpkg/local/3.0/knitr/man/knitr-package.Rd 
                                                                                                            "  Yihui Xie <http://yihui.name>" 
                                                                                        d:/RCompile/CRANpkg/local/3.0/knitr/man/read_chunk.Rd 
                      "  Yihui Xie; the idea of the second approach came from  Peter Ruckdeschel (author of the SweaveListingUtils  package)" 
                                                                                       d:/RCompile/CRANpkg/local/3.0/knitr/man/read_rforge.Rd 
                                                                                                          "  Yihui Xie and Peter Ruckdeschel" 
                                                                                           d:/RCompile/CRANpkg/local/3.0/knitr/man/rst2pdf.Rd 
                                                                                                               "  Alex Zvoleff and Yihui Xie" 
                                                                                              d:/RCompile/CRANpkg/local/3.0/knitr/man/spin.Rd 
"  Yihui Xie, with the original idea from Richard FitzJohn  (who named it as sowsear() which meant to make a  silk purse out of a sow's ear)" 

And maybe also tools:

> getauthors("tools")
                       D:/murdoch/recent/R64-3.0/src/library/tools/man/bibstyle.Rd 
                                                                "  Duncan Murdoch" 
                   D:/murdoch/recent/R64-3.0/src/library/tools/man/checkPoFiles.Rd 
                                                                "  Duncan Murdoch" 
                        D:/murdoch/recent/R64-3.0/src/library/tools/man/checkRd.Rd 
                                                  "  Duncan Murdoch, Brian Ripley" 
                     D:/murdoch/recent/R64-3.0/src/library/tools/man/getDepList.Rd 
                                                                   " Jeff Gentry " 
                      D:/murdoch/recent/R64-3.0/src/library/tools/man/HTMLlinks.Rd 
                                                    "Duncan Murdoch, Brian Ripley" 
            D:/murdoch/recent/R64-3.0/src/library/tools/man/installFoundDepends.Rd 
                                                                     "Jeff Gentry" 
                D:/murdoch/recent/R64-3.0/src/library/tools/man/makeLazyLoading.Rd 
                                                   "Luke Tierney and Brian Ripley" 
                       D:/murdoch/recent/R64-3.0/src/library/tools/man/parse_Rd.Rd 
                                                                " Duncan Murdoch " 
                     D:/murdoch/recent/R64-3.0/src/library/tools/man/parseLatex.Rd 
                                                                  "Duncan Murdoch" 
                        D:/murdoch/recent/R64-3.0/src/library/tools/man/Rd2HTML.Rd 
                                                  "  Duncan Murdoch, Brian Ripley" 
                 D:/murdoch/recent/R64-3.0/src/library/tools/man/Rd2txt_options.Rd 
                                                                  "Duncan Murdoch" 
                   D:/murdoch/recent/R64-3.0/src/library/tools/man/RdTextFilter.Rd 
                                                                "  Duncan Murdoch" 
                D:/murdoch/recent/R64-3.0/src/library/tools/man/SweaveTeXFilter.Rd 
                                                                  "Duncan Murdoch" 
                       D:/murdoch/recent/R64-3.0/src/library/tools/man/texi2dvi.Rd 
                     "  Originally Achim Zeileis but largely rewritten by R-core." 
                  D:/murdoch/recent/R64-3.0/src/library/tools/man/tools-package.Rd 
"  Kurt Hornik and Friedrich Leisch  Maintainer: R Core Team R-core@r-project.org" 
                D:/murdoch/recent/R64-3.0/src/library/tools/man/vignetteDepends.Rd 
                                                                   " Jeff Gentry " 
                 D:/murdoch/recent/R64-3.0/src/library/tools/man/vignetteEngine.Rd 
                                            "Duncan Murdoch and Henrik Bengtsson." 
                  D:/murdoch/recent/R64-3.0/src/library/tools/man/writePACKAGES.Rd 
                                                        "  Uwe Ligges and R-core."

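Some functions have no author field, so those entries are simply dropped when unlist is called at the end of getauthors, but the code could be modified slightly to return NULL values for them instead.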
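As a minimal sketch of that modification: because unlist() drops NULL entries, one option is to return NA_character_ for pages without an \author section, so every page keeps a slot in the result. The names rd_top_tags, getauthors_keep_na and parse_toy are hypothetical helpers introduced here, not part of the original answer. The tags are read from the "Rd_tag" attribute directly rather than through the internal tools:::RdTags. The toy pages are only there so the sketch runs without any particular package installed.

```r
# Tags of the top-level elements of one parsed Rd object.
rd_top_tags <- function(rd) {
  vapply(rd, function(el) {
    tag <- attr(el, "Rd_tag")
    if (is.null(tag)) "" else tag
  }, character(1))
}

# Hypothetical variant of getauthors(): takes the list returned by
# tools::Rd_db() and returns one entry per page, NA where there is
# no \author section.
getauthors_keep_na <- function(db) {
  vapply(db, function(rd) {
    is_author <- rd_top_tags(rd) == "\\author"
    if (!any(is_author)) return(NA_character_)
    txt <- paste(unlist(unclass(rd)[is_author]), collapse = "")
    trimws(gsub("\n", "", txt, fixed = TRUE))
  }, character(1))
}

# Write a few Rd lines to a temporary file and parse them.
parse_toy <- function(lines) {
  f <- tempfile(fileext = ".Rd")
  writeLines(lines, f)
  tools::parse_Rd(f)
}

toy_db <- list(
  with_author.Rd = parse_toy(c("\\name{a}", "\\title{A}",
                               "\\description{Page A.}",
                               "\\author{Jane Doe}")),
  no_author.Rd   = parse_toy(c("\\name{b}", "\\title{B}",
                               "\\description{Page B.}"))
)

res <- getauthors_keep_na(toy_db)
print(res)
```

For an installed package, the same function could presumably be called as getauthors_keep_na(tools::Rd_db("knitr")).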

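Also, further parsing gets a bit harder, because package authors seem to use this field in very different ways. There is only one author field in devtools. There are a bunch in car, each of which contains an email address. And so on. But this gets you to the available information, which you should be able to work with further.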
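To give a feel for what that further parsing might look like, here is a rough sketch in base R. The sample strings are copied from the output above. The email pattern and the split_names helper are illustrative assumptions, not a robust parser: real author fields are free text and will break any simple rule.

```r
# Sample author strings taken from the getauthors() output above.
authors <- c(
  "tools-package.Rd" = "  Kurt Hornik and Friedrich Leisch  Maintainer: R Core Team R-core@r-project.org",
  "checkRd.Rd"       = "  Duncan Murdoch, Brian Ripley",
  "knitr-package.Rd" = "  Yihui Xie <http://yihui.name>"
)

# A deliberately simple email pattern (an assumption, not a full RFC matcher).
email_pattern <- "[[:alnum:]._%+-]+@[[:alnum:].-]+[.][[:alpha:]]{2,}"
emails <- regmatches(authors, gregexpr(email_pattern, authors))

# Hypothetical helper: split one author string on commas and the word "and".
split_names <- function(x) {
  parts <- strsplit(trimws(x),
                    "[[:space:]]*,[[:space:]]*|[[:space:]]+and[[:space:]]+")[[1]]
  parts[nzchar(parts)]
}

print(emails)
print(split_names(authors[["checkRd.Rd"]]))
```

Strings such as the tools-package entry, which mix names, a maintainer label and an address, show why a single splitting rule is unlikely to cover every package.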

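Note: A previous version of this answer offered a solution for the case where you have the full path to the Rd file, but that does not work if you are trying to do this for an installed package. Following Tyler's suggestion, I worked out a more complete solution.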
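Coming back to the hypothetical get_field_from_r_help from the question, the same idea extends to any top-level section, not just \author. The following is a sketch under assumptions: get_rd_field is a made-up name, it takes an already parsed Rd object (for example one element of tools::Rd_db(package)), and it flattens all markup inside the section to plain text. A toy page is used so that the example does not depend on any installed help database.

```r
# Hypothetical helper: return the text of one top-level section
# (e.g. "description", "author") of a parsed Rd object, or NA if absent.
get_rd_field <- function(rd, field) {
  tags <- vapply(rd, function(el) {
    tag <- attr(el, "Rd_tag")
    if (is.null(tag)) "" else tag
  }, character(1))
  hit <- tags == paste0("\\", tolower(field))
  if (!any(hit)) return(NA_character_)
  # unlist() flattens nested markup such as \code{} down to its text
  txt <- paste(unlist(unclass(rd)[hit]), collapse = "")
  trimws(gsub("[[:space:]]+", " ", txt))
}

# Toy Rd page written to a temporary file.
toy_file <- tempfile(fileext = ".Rd")
writeLines(c("\\name{toy}",
             "\\title{Toy page}",
             "\\description{\\code{toy} is used to",
             "illustrate field extraction.}"),
           toy_file)
toy_rd <- tools::parse_Rd(toy_file)

desc <- get_rd_field(toy_rd, "Description")
print(desc)
print(get_rd_field(toy_rd, "author"))
```

To look up a topic such as "lm", one would still need to pick the right page out of tools::Rd_db("stats"), for instance by comparing the \name or \alias entries of each page with the topic.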

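【Discussion】:

  • Can you show us an example with a package, where you run through all the .Rd files and grab the authors? I tried this approach and could not get it to work, and I would like to see this cleaner approach succeed.
  • @TylerRinker See the update. I tried it on a few packages and it seems to work.
  • Great answer, Thomas - thank you very much. Both informative and useful :)
  • Dear Thomas, I wanted to share that, thanks to your code, I was able to add a "package_authors" function to the installr package - it helps me give people credit in the DESCRIPTION file. Thanks! (e.g.: github.com/talgalili/installr)

【Solution 2】: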

Here is my approach, using some of the suggestions others have made:

package <- "qdap"
funs <- unclass(lsf.str(envir = asNamespace(package)))

out <- sapply(funs, function(x) {
    x <- try(capture.output(tools:::Rd2txt(utils:::.getHelpFile(as.character(help(x, help_type="text"))))))
    Auth_lines <- grep("_\bA_\bu_\bt_\bh_\bo_\br(_\bs):", x, fixed = TRUE) 
    if (identical(Auth_lines, integer(0))) {
        return(NA)
    }
    gsub("^\\s+|\\s+$", "", x[Auth_lines +2])
})

## To look at just the ones with author fields:
out[!sapply(out, is.na)]

## > out[!sapply(out, is.na)]
##                                                         beg2char 
##                   "Josh O'Brien, Justin Haynes and Tyler Rinker" 
##                                                         bracketX 
##       "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                    bracketXtract 
##       "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                         char2end 
##                   "Josh O'Brien, Justin Haynes and Tyler Rinker" 
##                                                 cm_df.transcript 
## "DWin, Gavin Simpson and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                            gantt 
##           "DigEmAll (<URL: stackoverflow.com>) and Tyler Rinker" 
##                                                       gantt_wrap 
##     "Andrie de Vries and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                             genX 
##       "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                        genXtract 
##       "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                             hash 
##      "Bryan Goodrich and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                         name2sex 
##    "Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                  read.transcript 
##      "Bryan Goodrich and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                      sentCombine 
##    "Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                        sentSplit 
##    "Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                              TOT 
##    "Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                          v.outer 
##   "Vincent Zoonekynd and Tyler Rinker <tyler.rinker@gmail.com>." 
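The grep pattern above contains "_\b" pairs because the text rendering overstrikes section headings (underscore, backspace, letter). As a sketch of the "grep for the heading, then take everything up to the next heading" idea from the comments, one could strip those pairs first and match plain headings. The helper name section_from_text and the heading regular expression are assumptions made for illustration, and the toy page stands in for a real help file.

```r
# Hypothetical helper: pull one section out of the text rendering of a help page.
section_from_text <- function(txt, heading) {
  txt <- gsub("_\b", "", txt, fixed = TRUE)   # drop overstrike pairs, if any
  txt <- sub("[[:space:]]+$", "", txt)        # drop trailing whitespace
  # assumption: headings start in column one and end with a colon
  is_heading <- grepl("^[^[:space:]].*:$", txt)
  start <- which(txt == paste0(heading, ":"))
  if (length(start) == 0) return(NA_character_)
  later <- which(is_heading & seq_along(txt) > start[1])
  end <- if (length(later)) later[1] - 1 else length(txt)
  body <- trimws(txt[seq_len(end)][-seq_len(start[1])])
  paste(body[nzchar(body)], collapse = " ")
}

# Toy Rd page rendered to text with tools::Rd2txt().
toy_file <- tempfile(fileext = ".Rd")
writeLines(c("\\name{toy}",
             "\\title{Toy page}",
             "\\description{A toy page.}",
             "\\author{Jane Doe and John Roe}"),
           toy_file)
txt <- capture.output(tools::Rd2txt(toy_file))

toy_author <- section_from_text(txt, "Author(s)")
print(toy_author)
```

Unlike taking the line two below the heading, this collects every line of the section, which matters when an author field wraps over several lines.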

【Discussion】:

  • Hi Tyler, nice answer (you get a +1, since Thomas seems to have found some better functions to rely on). Thanks :)