Efficiently consolidating source code from multiple files into a single file, including (Roxygen) comments

【Question title】: Efficiently consolidating source code from multiple files to a single file including (Roxygen) comments
【Posted】: 2012-09-12 15:28:08
【Question】:

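Question

What is an efficient way to consolidate source code from multiple source files into a single source file, including Roxygen documentation comments and possibly other comments as well?

I remember that some package offered a parsing approach that accounts for comments (storing them in some attribute "field"), but I can't find it anymore.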

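Background

I like having the flexibility to consolidate source code from a given number of source files into a single source file, mainly for two reasons:

In order to stay sane, I stick to a "one definition per file" paradigm, where each source file contains exactly one definition (function, S4 method, S4 Reference Class, etc.). In addition, these source files may live in various subdirectories of my "source directory" and can therefore even share file names. To compose an actual R package, however, it is sometimes better to group several definitions into a single source file. With duplicate file names, I even have to.
When parallelizing, it is handy to be able to group all the required source code into one file and push it to the worker processes so that they can source it.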

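Homework

This is my current solution; it feels "okay", but

I suspect there are better, more efficient approaches, and
it seems a bit fragile when it comes to detecting Roxygen code.

Creating example source files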

foo1 <- function(x) {message("I'm foo #1"); return(TRUE)}
roxy.1 <- c(
    "#' Title foo1()",
    "#'", 
    "#' Description foo1().",
    "##' This line is commented out",
    "#'", 
    "#' @param x Some R object that doesn't matter.",
    "#' @return \\code{TRUE}.",
    "#' @references \\url{http://www.something.com/}",
    "#' @author Janko Thyson \\email{john.doe@@something.com}",
    "#' @seealso \\code{\\link{foo2}}",
    "#' @example inst/examples/foo1.R"
)

foo2 <- function(y) {message("I'm foo #2"); return(FALSE)}
roxy.2 <- c(
    "#' Title foo2()",
    "#'", 
    "#' Description foo2().",
    "##' This line is commented out",
    "#'", 
    "#' @param y Some R object that doesn't matter.",
    "#' @return \\code{FALSE}.",
    "#' @references \\url{http://www.something.com/}",
    "#' @author Janko Thyson \\email{john.doe@@something.com}",
    "#' @seealso \\code{\\link{foo1}}",
    "#' @example inst/examples/foo2.R"
)

dir.create("src/functions", recursive=TRUE, showWarnings=FALSE)
dir.create("src/conso", recursive=TRUE, showWarnings=FALSE)

write(roxy.1, file="src/functions/foo1.R")
write(deparse(foo1), file="src/functions/foo1.R", append=TRUE)
write(roxy.2, file="src/functions/foo2.R")
write(deparse(foo2), file="src/functions/foo2.R", append=TRUE)

Consolidation function

consolidateThis <- function(
    path="src/functions",
    path.conso="src/conso/src_functions.R",
    rgx.roxy="^(#' ?|##' ?)(\\w*|@|$)",
    do.overwrite=TRUE,
    do.roxygen=TRUE,
    ...
) {
    if (!file.exists(path)) {
        stop("Check your 'path' argument")
    }
    files <- list.files(path, full.names=TRUE)
    if (do.overwrite) {
        file.create(path.conso)
    }
    sapply(files, function(ii) {
        this <- readLines(con=ii, warn=FALSE)
        code <- base::parse(text=this)
        if (do.roxygen) {     
            idx.roxy <- grep(rgx.roxy, this)
            if (length(idx.roxy)) {
                if (length(idx.roxy) == 1) {
                    stop("Weird roxygen code (just a one-liner)") 
                }
                bench <- seq(from=idx.roxy[1], max(idx.roxy))
                if (!all(bench %in% idx.roxy)) {
                    stop("Breaks in your roxygen code. Possibly detected comments that aren't roxygen code")
                }
                code.roxy <- this[idx.roxy]
                write(code.roxy, file=path.conso, append=TRUE)
            }
        }
        write(c(deparse(code[[1]]), ""), file=path.conso, append=TRUE)
    })
    return(path.conso)
}
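
On the fragility of the Roxygen detection: because `\\w*` can match the empty string, the default `rgx.roxy` above effectively matches any line that starts with `#'` or `##'`, and it does not allow leading whitespace. The following is only a sketch of a simpler detection step, not the original author's method. The helper name `extract_roxy` and the pattern `^\\s*#+'` are assumptions made for illustration. Unlike `consolidateThis()`, it returns the first contiguous block of Roxygen lines instead of stopping when the matches are not contiguous.

```r
# Hypothetical helper (not part of the original post): return the first
# contiguous run of roxygen-style lines in a character vector of source lines.
extract_roxy <- function(lines, rgx = "^\\s*#+'") {
    is.roxy <- grepl(rgx, lines)
    if (!any(is.roxy)) {
        return(character(0))
    }
    # Run-length encode the matches so that each contiguous block is one run
    runs <- rle(is.roxy)
    ends <- cumsum(runs$lengths)
    starts <- ends - runs$lengths + 1
    first <- which(runs$values)[1]   # index of the first run of matches
    lines[starts[first]:ends[first]]
}

src <- c(
    "#' Title foo1()",
    "#'",
    "##' This line is commented out",
    "#' @param x Some R object that doesn't matter.",
    "foo1 <- function(x) {",
    "    # an ordinary comment, not roxygen",
    "    TRUE",
    "}"
)
roxy <- extract_roxy(src)
```

Here `roxy` holds the first four lines of `src`; the ordinary `#` comment inside the function body is not picked up because it lacks the trailing apostrophe.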

Applying the function

path <- consolidateThis()
> path
[1] "src/conso/src_functions.R"

So now there is a source file 'src/conso/src_functions.R' that contains the consolidated code.

【Question comments】:

Tags: r roxygen2 roxygen consolidation

【Solution 1】:

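Do you specifically need to parse (and then deparse) the functions' source code? If not, you can simplify your code considerably.

The following produces exactly the same output as consolidateThis().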

ConsolidateThis2 <-
function(path="src/functions",
         path.conso="src/conso/src_functions.R",
         overwrite = TRUE) {
    if(overwrite) cat("", file = path.conso) # Blank out the file's contents

    ## A function to append infile's contents to outfile and add 2 <RET>          
    prettyappend <- function(infile, outfile) {
        file.append(outfile, infile)
        cat("\n\n", file = outfile, append = TRUE)
    }

    ## Append all files in 'path' to file 'path.conso'
    sapply(dir(path, full.names=TRUE), prettyappend, path.conso)
}

ConsolidateThis2()
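
The question also mentions source files spread over subdirectories that may share file names. A possible variation on the same append-verbatim idea is sketched below. It is not part of the original answer: the function name `consolidate_recursive`, the `pattern` argument and the "Source file" marker line are assumptions made for illustration. It relies only on `list.files(recursive = TRUE)`, `readLines()` and `writeLines()`.

```r
# Hypothetical variant (names are illustrative, not from the original answer)
consolidate_recursive <- function(path, path.conso, pattern = "\\.[Rr]$") {
    files <- list.files(path, pattern = pattern, recursive = TRUE,
                        full.names = TRUE)
    out <- unlist(lapply(files, function(f) {
        # Keep each file verbatim, preceded by a marker naming its origin
        c(paste("## Source file:", f), readLines(f, warn = FALSE), "")
    }))
    # as.character() turns NULL into character(0) when no files were found
    writeLines(as.character(out), path.conso)
    invisible(files)
}

# Self-contained demo in a temporary directory: two files named foo.R
root <- file.path(tempdir(), "conso-demo")
unlink(root, recursive = TRUE)
dir.create(file.path(root, "a"), recursive = TRUE)
dir.create(file.path(root, "b"), recursive = TRUE)
writeLines(c("#' Title foo1()", "foo1 <- function(x) TRUE"),
           file.path(root, "a", "foo.R"))
writeLines(c("#' Title foo2()", "foo2 <- function(y) FALSE"),
           file.path(root, "b", "foo.R"))

target <- file.path(tempdir(), "conso-demo-out.R")
consolidate_recursive(root, target)

# The consolidated file can be sourced like any other R script
env <- new.env()
source(target, local = env)
```

Because the files are copied line by line, Roxygen comments and ordinary comments survive unchanged, and identical base names in different subdirectories do not clash.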

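【Discussion】:

Nice, thanks! I'll have to double-check the parsing; at first glance you seem to be right that it isn't really necessary. I probably carried it over from a context where it was needed. Thanks again!
Do you happen to know how to make the parser distinguish between actual code and comments, keeping the comments as an attribute? I'm pretty sure there is a parse function out there that does this, but I can't seem to find it anymore. The reason this would be handy is that my consolidation routine is nested inside a function that investigates the class dependencies of S4 Reference Classes in order to provide a valid collation order for sourcing. For that, I need to parse the definitions.
@Rappster -- Yes, I have a similar impression, probably somewhere in the knitr / parser / formatR cluster of packages. I'd look at formatR::tidy.source() first. The examples here suggest that it is clearly "comment-aware": github.com/yihui/formatR/wiki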
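
As a possible answer to the parser question in the comments above: in more recent R versions (3.0.0 and later, so after this thread was written), `parse(keep.source = TRUE)` combined with `utils::getParseData()` exposes comments as `COMMENT` tokens next to the parsed expressions. The snippet below is only a sketch of that approach, not necessarily the function the commenters had in mind; the variable names and the `^#+'` pattern are assumptions made for illustration.

```r
src <- c(
    "#' Title foo1()",
    "#' @param x Some R object that doesn't matter.",
    "foo1 <- function(x) {",
    "    # an ordinary comment",
    "    TRUE",
    "}"
)

# keep.source = TRUE attaches source references, which getParseData() needs
exprs <- parse(text = src, keep.source = TRUE)
pd <- utils::getParseData(exprs)

# Comments are ordinary rows of the parse data with token "COMMENT"
comments <- pd[pd$token == "COMMENT", c("line1", "text")]

# Separate roxygen comments from ordinary ones (pattern is an assumption)
is.roxy <- grepl("^#+'", comments$text)
roxy.lines <- comments$text[is.roxy]
```

`exprs` still holds the parsed definition for dependency analysis, while `comments` gives each comment together with the source line it came from.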