Function to loop calculation of standard errors for model predictions: answers

【Question title】: Function to loop calculation of standard errors for model predictions
【Posted】: 2021-12-24 10:57:45
【Question】:

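I am looking for some help looping my code, or creating a function, for the calculation I need.

My data frame is below. Every column repeats the same value in each row except newdat2$time, whose value changes from row to row: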

newdat2 <- data.frame(season = rep("Summer", 31), 
                      time = seq(0, 3, by = 0.1), 
                      temp = rep(21.79384, 31),
                      last.rain.bom = rep(4.232604, 31),
                      rain = rep(0.916501, 31),
                      wind = rep("nil", 31),
                      cloud = rep(40.20378, 31),
                      abundance = rep(117.6262, 31),
                      site = rep("Avalon", 31))

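For each row of this data frame, I want to carry out the calculation below. It computes the standard error of a prediction from the fitted model; see the reference linked in the original post.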

C = c(0,0,0,0,0,0,0.0,0,0,0,0,0,0, 0, 1,0,0,0,time,21.8,4.23,0.917,0,0,0,40.2,4.78) # This represents covariate values of my fitted model. The value of time needs to change for each row of newdat2$time, all other values remain the same
s <- vcov(zib) # zib is my fitted model and this row of code is taking the variance covariance matrix of my fitted model. s is a matrix 27x27
newdat2$se <- sqrt(t(C) %*% s %*% C) # This then calculates the standard errors for my model predictions but C must change for each row of newdat2 to reflect the change in newdat2$time

For example, the first calculation the loop/function performs would be

C = c(0,0,0,0,0,0,0.0,0,0,0,0,0,0, 0, 1,0,0,0,0,21.8,4.23,0.917,0,0,0,40.2,4.78) # 0 is the first value of newdat2$time
s <- vcov(zib) 
newdat2$se <- sqrt(t(C) %*% s %*% C)

The second calculation the loop/function performs would be

C = c(0,0,0,0,0,0,0.0,0,0,0,0,0,0, 0, 1,0,0,0,0.1,21.8,4.23,0.917,0,0,0,40.2,4.78) # 0.1 is the second value of newdat2$time
s <- vcov(zib) 
newdat2$se <- sqrt(t(C) %*% s %*% C)

The third calculation the loop/function performs would be

C = c(0,0,0,0,0,0,0.0,0,0,0,0,0,0, 0, 1,0,0,0,0.2,21.8,4.23,0.917,0,0,0,40.2,4.78) # 0.2 is the third value of newdat2$time
s <- vcov(zib) 
newdat2$se <- sqrt(t(C) %*% s %*% C)

Any help with looping this calculation, or with writing a function that does it, would be much appreciated.

【Comments】:

Tags: r function

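【Solution 1】:

I don't have the data or the expected results here, but this should work:
The idea is to build a matrix whose columns are all the versions of the vector C, and then do the calculation with that matrix. You only need the diagonal elements of t(m) %*% s %*% m, so I think colSums(m * s %*% m) gives the same answer, but faster.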

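# Base covariate vector; element 19 is the slot for time
C = c(0,0,0,0,0,0,0.0,0,0,0,0,0,0, 0, 1,0,0,0,0,21.8,4.23,0.917,0,0,0,40.2,4.78)
# One column per row of newdat2, each column a copy of C
m <- matrix(rep(C, length(newdat2$time)), ncol = length(newdat2$time))
m[19, ] <- newdat2$time   # overwrite the time slot in every column
s <- vcov(zib)
# Diagonal of t(m) %*% s %*% m; %*% binds tighter than *, so this is m * (s %*% m)
newdat2$se <- sqrt(colSums(m * s %*% m))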

This should be faster than a loop.
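
The claim that colSums(m * s %*% m) matches the diagonal of t(m) %*% s %*% m can be checked on made-up inputs. The following is a minimal sketch, assuming a small synthetic covariance matrix in place of vcov(zib), since the fitted model zib is not available here. The names s_demo, C_demo, time_demo and m_demo are illustrative only and do not come from the question.

```r
# Synthetic stand-in for vcov(zib): a symmetric positive-definite 4 x 4 matrix.
set.seed(1)
A <- matrix(rnorm(16), nrow = 4)
s_demo <- crossprod(A) + diag(4)

# Base covariate vector; position 3 plays the role of the time slot.
C_demo <- c(1, 0.5, 0, 2)
time_demo <- seq(0, 1, by = 0.25)

# One column per time value, with only row 3 varying.
m_demo <- matrix(rep(C_demo, length(time_demo)), ncol = length(time_demo))
m_demo[3, ] <- time_demo

# Vectorized form: %*% binds tighter than *, so this is m_demo * (s_demo %*% m_demo).
se_vectorized <- sqrt(colSums(m_demo * s_demo %*% m_demo))

# Reference form: one quadratic form per column, as in the question.
se_loop <- vapply(seq_len(ncol(m_demo)), function(j) {
  cj <- m_demo[, j]
  sqrt(drop(t(cj) %*% s_demo %*% cj))
}, numeric(1))

# Compare the two results up to floating-point tolerance.
all.equal(se_vectorized, se_loop)
```

The vectorized form avoids building the full square matrix t(m) %*% s %*% m, of which only the diagonal is needed.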

【Comments】:

Sorry, I meant to use m in the last line, not C. Fixed now.
Thanks, this works :) You're right, it is faster than using a for() loop.

【Solution 2】:

With a loop, you can do it like this:

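s <- vcov(zib)                     # compute once, outside the loop
newdat <- numeric(nrow(newdat2))   # preallocate the result vector
for (i in seq_along(newdat2$time)) {
    C <- c(0,0,0,0,0,0,0.0,0,0,0,0,0,0, 0, 1,0,0,0,newdat2$time[i],21.8,4.23,0.917,0,0,0,40.2,4.78)
    newdat[i] <- sqrt(drop(t(C) %*% s %*% C))   # drop() turns the 1x1 matrix into a scalar
}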

Now you just need to add the newdat vector to the data frame. However, I agree with @Brian above: this is slower than the vectorized approach he suggests.
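
Since the question also asks for a function, either approach can be wrapped in one. The following is a rough sketch, not code from either answer: prediction_se and its arguments are hypothetical names, and the demo uses an identity matrix as a stand-in for vcov(zib).

```r
# Hypothetical helper: returns one standard error per element of `values`.
#   C_base : base covariate vector
#   values : values to substitute into C_base, one prediction each
#   s      : variance-covariance matrix, square, with nrow(s) == length(C_base)
#   index  : position in C_base to overwrite
prediction_se <- function(C_base, values, s, index) {
  stopifnot(nrow(s) == ncol(s), length(C_base) == nrow(s))
  # One column per value, each column a copy of C_base.
  m <- matrix(rep(C_base, length(values)), ncol = length(values))
  m[index, ] <- values
  # Diagonal of t(m) %*% s %*% m without forming the full product.
  sqrt(colSums(m * (s %*% m)))
}

# Demo with an identity matrix standing in for a real covariance matrix,
# so each result is just the Euclidean norm of the corresponding column.
C_base <- c(1, 0, 2)
se <- prediction_se(C_base, values = c(0, 1, 2), s = diag(3), index = 2)
```

With the objects from the question, the call would presumably look like newdat2$se <- prediction_se(C, newdat2$time, vcov(zib), index = 19), where 19 is the position of time in the question's C vector.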

【Comments】: