R and data.table: using a custom function to create new output instead of ddply

【Question title】: R and data.table using a customized function to create new output instead of ddply
【Posted】: 2020-07-03 01:25:02
【Question description】:

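I'm new to data.table, but I need to optimize the post-processing of large simulations. I would normally use ddply to get a new output with the desired calculated parameters, based on a custom function (estimate_AUC) that can accommodate different column names (e.g. TIME and Cc) and different calculation methods (e.g. last, inf, etc.):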

AUC_out <- plyr::ddply(sim, c("ID","Dose"), function(x) {
  # One row of summary parameters per ID/Dose group
  out <- data.frame(AUCinf = estimate_AUC(Time = x$TIME,
                                          Conc = x$Cc,
                                          AUCtype = "inf"),
                    AUC48  = estimate_AUC(Time = x$TIME[x$TIME<=48],
                                          Conc = x$Cc[x$TIME<=48],
                                          AUCtype = "last"),
                    # Concentration interpolated at TIME = 48
                    Cc48   = approx(x$TIME, x$Cc, 48)$y,
                    stringsAsFactors = FALSE)
})

With sim:

        ID          Cc    TIME Dose
    1:   1 0.000000000    0.00  100
    2:   1 0.462881773    0.25  100
    3:   1 0.625713766    0.50  100
    4:   1 0.729046515    0.75  100
    5:   1 0.825169830    1.00  100
   ---

How can I use custom functions with data.table while still being able to pass the method and specific column names as function arguments?

dput(head(sim))
structure(list(ID = c(1, 1, 1, 1, 1, 1), DoseID = c(1L, 1L, 1L, 
1L, 1L, 1L), Dose = c(100, 100, 100, 100, 100, 100), nbrDoses = c(1, 
1, 1, 1, 1, 1), ExpID = c(1, 1, 1, 1, 1, 1), TrialID = c(1L, 
1L, 1L, 1L, 1L, 1L), IndivID = c(1L, 1L, 1L, 1L, 1L, 1L), USUBJID = c(11, 
11, 11, 11, 11, 11), TIME = c(0, 0.25, 0.5, 0.75, 1, 1.25), Cc = c(0, 
0.462881773273397, 0.625713765604934, 0.729046515431686, 0.825169830220163, 
0.92030770178198), PL = c(14.8635310605163, 14.8810310604533, 
14.8985310551099, 14.916031006317, 14.9335308009029, 14.9510302005905
), Eff = c(5.19411550856408e-19, 1.18067555547615e-08, 4.21253176904848e-07, 
2.63818207596035e-06, 9.25475212778715e-06, 2.43639651038346e-05
)), class = c("data.table", "data.frame"), row.names = c(NA, 
-6L), .internal.selfref = <pointer: 0x00000000045e1ef0>)

【Question comments】:

Hi, could you include the code that generates sim? Use dput for this and paste the result into your post.

Tags: r function data.table plyr

【Solution 1】:

Here is an example using the PKNCA package. With only one dose and one ID, there is not much data to compute on...

library(data.table)
library(PKNCA)
sim <- structure(list(ID = c(1, 1, 1, 1, 1, 1), 
                      DoseID = c(1L, 1L, 1L, 1L, 1L, 1L), Dose = c(100,100, 100, 100, 100, 100), nbrDoses = c(1, 1, 1, 1, 1, 1), 
                      ExpID = c(1, 1, 1, 1, 1, 1), 
                      TrialID = c(1L, 1L, 1L, 1L, 1L, 1L), 
                      IndivID = c(1L, 1L, 1L, 1L, 1L, 1L), 
                      USUBJID = c(11, 11, 11, 11, 11, 11), TIME = c(0, 0.25, 0.5, 0.75, 1, 1.25), 
                      Cc = c(0, 0.462881773273397, 0.625713765604934, 0.729046515431686, 0.825169830220163, 0.92030770178198), 
                      PL = c(14.8635310605163, 14.8810310604533, 14.8985310551099, 14.916031006317, 14.9335308009029, 14.9510302005905), 
                      Eff = c(5.19411550856408e-19, 1.18067555547615e-08, 4.21253176904848e-07, 2.63818207596035e-06, 9.25475212778715e-06, 2.43639651038346e-05)), class = c("data.table", "data.frame"), row.names = c(NA, -6L))
setDT(sim)
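# Each expression inside .() is evaluated once per ID/Dose group.
# Note: the two AUC values match in the output below; pk.calc.auc() appears to
# return AUClast here (no extrapolation), so AUC.inf is not a true AUC to infinity.
sim[, .(AUC.inf = pk.calc.auc(Cc, TIME, interval=c(0, Inf)),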
        AUC.48 = pk.calc.auc(Cc, TIME, interval=c(0, 48)),
        Cc48 = approx(TIME, Cc, 48)$y
        ), by = c("ID", "Dose")]
#>    ID Dose   AUC.inf    AUC.48 Cc48
#> 1:  1  100 0.7757414 0.7757414   NA

Created on 2020-03-22 by the reprex package (v0.3.0)
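The same pattern should carry over to a custom function, with the column names and the method passed as arguments. The sketch below is only an illustration: the asker's estimate_AUC() is not shown in the question, so the version here is a hypothetical stand-in (linear trapezoid, plus a simple log-linear tail extrapolation for "inf"), and summarise_auc(), n_tail, t_cut and sim_demo are names made up for this example.

```r
library(data.table)

# Hypothetical stand-in for the asker's estimate_AUC(); the real one is not shown.
# "last": linear trapezoidal AUC up to the last observation.
# "inf":  adds Clast / lambda_z, with lambda_z from a log-linear fit of the
#         last n_tail points (assumes those concentrations are all > 0).
estimate_AUC <- function(Time, Conc, AUCtype = c("last", "inf"), n_tail = 3) {
  AUCtype <- match.arg(AUCtype)
  ord  <- order(Time)
  Time <- Time[ord]
  Conc <- Conc[ord]
  auc <- sum(diff(Time) * (head(Conc, -1) + tail(Conc, -1)) / 2)
  if (AUCtype == "inf") {
    tail_t <- tail(Time, n_tail)
    tail_c <- tail(Conc, n_tail)
    slope  <- coef(lm(log(tail_c) ~ tail_t))[[2]]
    # Extrapolate only when the tail is actually declining
    auc <- if (slope < 0) auc + tail(Conc, 1) / -slope else NA_real_
  }
  auc
}

# Column names arrive as strings and are looked up in .SD, so the same
# wrapper can be reused for differently named time/concentration columns.
summarise_auc <- function(dt, time_col, conc_col, by_cols, t_cut = 48) {
  dt[, {
    tt   <- .SD[[time_col]]
    cc   <- .SD[[conc_col]]
    keep <- tt <= t_cut
    list(AUCinf = estimate_AUC(tt, cc, AUCtype = "inf"),
         AUC48  = estimate_AUC(tt[keep], cc[keep], AUCtype = "last"),
         Cc48   = approx(tt, cc, xout = t_cut)$y)
  }, by = by_cols, .SDcols = c(time_col, conc_col)]
}

# Made-up demo data: mono-exponential decline, two IDs with different doses
times    <- seq(0, 48, by = 0.25)
sim_demo <- data.table(ID   = rep(1:2, each = length(times)),
                       Dose = rep(c(100, 200), each = length(times)),
                       TIME = rep(times, times = 2))
sim_demo[, Cc := Dose * exp(-0.1 * TIME)]

AUC_out <- summarise_auc(sim_demo, "TIME", "Cc", c("ID", "Dose"))
print(AUC_out)
```

On the 6-row sample from the question, Cc48 stays NA for the same reason as in the output above: the data stop at TIME = 1.25 and approx() does not extrapolate by default. The "inf" branch of this stand-in would also return NA there, since the concentrations are still rising.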

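【Comments】:

Thanks for your help! This was just one example case, but I have many other custom functions to use in the same way. Perfect, from this I understand how to handle it.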