【问题标题】：R How to call a function on its own outputR如何在自己的输出上调用函数
【发布时间】：2015-12-11 17:34:16
【问题描述】：

情况是这样的。我有一个data.table，我想根据某些标准折叠该表的行。我写了一个函数，但它一次只能操作 2 行。所以我可以用一轮函数折叠的最多行是 50%（即从 1000 行开始，第 1 轮函数给我们留下 500 行）。在这一点上，似乎合乎逻辑的做法是在产生的输出上再次运行该函数，以便我可以进一步折叠行，然后再次执行此操作，直到我折叠所有可能的行。

我的功能

fun = function(x) {
<stuff the function does>
return(output) }

我想在其自己的输出上调用该函数，并重复此操作，直到输出不再因进一步的函数调用而改变。

我试过这个：

fun = function(x) {
<stuff>
output = resulting_dt
while (!identical(x,output)) {fun(output)}
return(output)
}

但这给了我一个错误：

Error in eval(expr, envir, enclos) : object '__' not found

我确信有一种方法可以完成这项工作，但我对 R 编程相当陌生，这是我必须编写的第一个真正的程序，因此非常感谢任何帮助或建议！

编辑：我采用了@gregor 和@42- 提供的解决方案的组合

fun = function(x) {
<stuff>
output = resulting_dt
if (!identical(x,output)) {return(Recall(output))}
else {return(output)}
}

对于那些对可重现函数感兴趣的人，我在想出一个函数时遇到了麻烦（它花了很长时间），所以这是我实际使用的一个丑陋的函数：

fun <- function (object)
{
  num = 1
  n = 1
  temp = list()
  while (n <= object[, length(chr)]) 
    { 
    if ( (n == (object[, length(chr)])) &&
         !( (object[n,chr] == object[n-1,chr]) && 
            (abs(object[n,end] - object[n-1,start]) < 500) &&
            (((object[n,meth.diff] >= 0) == (object[n-1,meth.diff] >= 0)) || 
             ((object[n,meth.diff] < 0) == (object[n-1,meth.diff] < 0)))))
         {
         x = data.table(
           chr=object[n,chr], start=object[n,start], end=object[n,end], 
           meth.diff=object[n,meth.diff], mean_KO=object[n,mean_KO], 
           mean_WT=object[n,mean_WT], coverage_KO=object[n,coverage_KO], 
           coverage_WT=object[n,coverage_WT]
         )
         temp[[num]] = x
         n = n + 1 
         num = num + 1
         }
    else if ( (object[n,chr] == object[n+1,chr]) && 
          (abs(object[n,end] - object[n+1,start]) < 500) &&
          (((object[n,meth.diff] >= 0) == (object[n+1,meth.diff] >= 0)) || 
             ((object[n,meth.diff] < 0) == (object[n+1,meth.diff] < 0)))) 
      {
      x = data.table(
        chr=object[n,chr], start=object[n,start], end=object[n+1, end], meth.diff= mean(c(object[n,meth.diff], object[n+1,meth.diff])), 
        mean_KO=(((object[n,mean_KO] * object[n,coverage_KO])/(object[n,coverage_KO] + object[n+1,coverage_KO])) +
                   ((object[n+1,mean_KO] * object[n+1,coverage_KO])/(object[n,coverage_KO] + object[n+1,coverage_KO]))),
        mean_WT=(((object[n,mean_WT] * object[n,coverage_WT])/(object[n,coverage_WT] + object[n+1,coverage_WT])) +
                   ((object[n+1,mean_WT] * object[n+1,coverage_WT])/(object[n,coverage_WT] + object[n+1,coverage_WT]))),
        coverage_KO=(object[n,coverage_KO] + object[n+1,coverage_KO]),
        coverage_WT=(object[n,coverage_WT] + object[n+1,coverage_WT])
        )
      x[, meth.diff := (mean_KO - mean_WT) ]
      temp[[num]] = x
      n = n + 2
      num = num + 1
      }
    else 
      {
        x = data.table(
          chr=object[n,chr], start=object[n,start], end=object[n,end], 
          meth.diff=object[n,meth.diff], mean_KO=object[n,mean_KO], 
          mean_WT=object[n,mean_WT], coverage_KO=object[n,coverage_KO], 
          coverage_WT=object[n,coverage_WT]
          )
        temp[[num]] = x
        n = n + 1 
        num = num + 1
      }
  }
  result = rbindlist(temp)
  #print(result)
  while (!identical(object,result)){fun(result)}
  else {return(result)}
}

AND 示例输入 data.table：

library(data.table)
dt = structure(list(chr = c("chr1", "chr1", "chr1", "chr1", "chr1", 
"chr1", "chr1", "chr1", "chr1", "chr1"), start = c(842326, 855423, 
855426, 855739, 855771, 880164, 880182, 880262, 1005284, 1005315
), end = c(842327L, 855424L, 855427L, 855740L, 855772L, 880165L, 
880183L, 880263L, 1005285L, 1005316L), meth.diff = c(9.35200555410902, 
19.1839617944039, 29.6734426495636, -12.3375577709254, 4.21809779410175, 
50.539925536006, 28.0168014922334, 35.1349192165154, 16.8742940741475, 
62.6063420676512), mean_KO = c(9.35200555410902, 19.1839617944039, 
32.962962583692, 1.8512250859083, 4.44417336983763, 67.0864799025607, 
31.1083297690512, 49.5746020684321, 25.1985773481452, 78.6766354515961
), mean_WT = c(0, 0, 3.28951993412841, 14.1887828568337, 0.226075575735883, 
16.5465543665547, 3.09152827681786, 14.4396828519167, 8.32428327399768, 
16.0702933839448), coverage_KO = c(139L, 55L, 55L, 270L, 270L, 
55L, 55L, 238L, 526L, 499L), coverage_WT = c(120L, 86L, 87L, 
444L, 442L, 116L, 115L, 362L, 649L, 647L)), .Names = c("chr", 
"start", "end", "meth.diff", "mean_KO", "mean_WT", "coverage_KO", 
"coverage_WT"), class = c("data.table", "data.frame"), row.names = c(NA, 
-10L))

以及我想要的输出示例（为了后代，因为它与这个问题并不完全相关）

library(data.table)
dt1 = structure(list(chr = c("chr1", "chr1", "chr1", "chr1", "chr1", 
"chr1"), start = c(842326, 855423, 855739, 855771, 880164, 1005284
), end = c(842327L, 855427L, 855740L, 855772L, 880263L, 1005316L
), meth.diff = c(9.35200555410902, 24.4191949389371, -12.3375577709254, 
4.21809779410175, 36.7726824955192, 39.0419497750433), mean_KO = c(9.35200555410902, 
26.073462189048, 1.8512250859083, 4.44417336983763, 49.4237638627169, 
51.2332612443618), mean_WT = c(0, 1.65426725011082, 14.1887828568337, 
0.226075575735883, 12.6510813671977, 12.1913114693185), coverage_KO = c(139L, 
110L, 270L, 270L, 348L, 1025L), coverage_WT = c(120L, 173L, 444L, 
442L, 593L, 1296L)), .Names = c("chr", "start", "end", "meth.diff", 
"mean_KO", "mean_WT", "coverage_KO", "coverage_WT"), row.names = c(NA, 
-6L), class = c("data.table", "data.frame"))

【问题讨论】：

请包含一个真实的函数而不是伪代码。
while 将由递归自动处理。将 while 行替换为 if (!identical(x, output)) return(fun(output))。出于调试目的，您可能需要输入打印语句或其他内容来检查它的深度。
函数很长，可以吗？我还可以提供示例输入/输出数据，我只是认为没有这些东西也可以回答这个问题。你怎么看？
我认为我的评论中的回答会起作用，但它假设其他一切都按预期工作。提出问题的理想方法是创建一个最小的工作示例 - 可以说明问题并且可以运行但忽略与问题的特定部分正交的复杂性。
@Gregor 我尝试了您的解决方案，包括打印语句帮助很大。该函数实际上是在循环，但事实证明它每次都离开了 data.table 的最后一行，因此每个循环都使表更短，直到在最后一个循环中没有更多行可以使用。 - 看来我需要更改功能才能使其正常工作。感谢您的帮助！

标签： r for-loop while-loop data.table

【解决方案1】：

这是?Recall中的例子

fib <- function(n)
   if(n<=2) { if(n>=0) 1 else 0 } else Recall(n-1) + Recall(n-2)

> fib(10)
[1] 55

【讨论】：

谢谢，这很好。我现在正在使用 Recall 来“调用”我的函数。

【解决方案2】：

这听起来像是 returnValue 和 on.exit 函数的任务。您没有提供可重复的示例，所以我不会为您想象一个。
returnValue 的工作示例可以在我的 dtq 包中找到。它用于记录任意函数返回的对象的行数。
returnValue必须在on.exit中使用，请参阅dtq/R/zzz.R#L23：

on.exit(
    dtq.local.log(
        timestamp = Sys.time(),
        end = if(isTRUE(getOption("dtq.log.nano")) && requireNamespace("microbenchmark", quietly=TRUE)) microbenchmark::get_nanotime()*1e-9 else proc.time()[[3L]],
        out_rows = as.integer(nrow(returnValue()))[1L]
    )
)
# hint on: as.integer(.)[1L]
# is used to return NA for object for which `nrow` will not make sense
# so to ensure length 1 integer type object is returned

请注意，此功能是最近在 base R 中引入的，因此如果您使用的是旧版本的 R，则可能找不到它。

【讨论】：

谢谢！ returnValue 函数看起来真的很有用。我会调查的。

**编辑：我采用了@gregor 和@42- ** 提供的解决方案的组合

编辑：我采用了@gregor 和@42- 提供的解决方案的组合