Replacing a double for loop to increase speed

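【Question title】: Replacing a double for loop to increase speed
【Posted】: 2016-07-01 22:12:43
【Question】:

I am estimating conditional marginal densities and evaluating them at new observations, then storing the estimates in an array. This code is slow, and I have not been able to speed it up significantly. Any help is much appreciated. Here is a small reproducible example: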

library(sm)

y <- rep(1:6, 30)
K <- length(unique(y))
X <- matrix(rnorm(180 * 1000), nrow=180)
newx <- matrix(rnorm(20 * 1000), nrow=20)

f.estimates <- array(dim=c(dim(newx)[1], dim(X)[2], K - 1))
g.estimates <- array(dim=c(dim(newx)[1], dim(X)[2], K - 1))
for(k in 1:(K - 1)) {
  for(j in 1:dim(X)[2]) {
    f.estimates[, j, k] <- sm.density(X[y <= k, j], 
                              eval.points=newx[, j], 
                              display="none")$estimate
    g.estimates[, j, k] <- sm.density(X[y > k, j], 
                              eval.points=newx[, j],
                              display="none")$estimate
  }
}

【Comments】:

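You could replace the inner loop with two sapply calls. That may improve performance slightly, by about 0.2. Have a look; you may need to transpose the resulting matrices.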
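A minimal sketch of the sapply suggestion, on a smaller data set than the question's so it runs quickly (the reduced sizes and the seed are assumptions for illustration). Only f.estimates is shown; g.estimates would follow the same pattern with y > k.

```r
library(sm)

set.seed(1)                      # illustrative seed, not from the question
y <- rep(1:6, 10)
K <- length(unique(y))
X <- matrix(rnorm(60 * 4), nrow = 60)
newx <- matrix(rnorm(3 * 4), nrow = 3)

f.estimates <- array(dim = c(nrow(newx), ncol(X), K - 1))
for (k in 1:(K - 1)) {
  # sapply binds each length-nrow(newx) result as a column, so the
  # matrix is already nrow(newx) x ncol(X) and needs no transpose here
  f.estimates[, , k] <- sapply(seq_len(ncol(X)), function(j)
    sm.density(X[y <= k, j],
               eval.points = newx[, j],
               display = "none")$estimate)
}
```

This still calls sm.density once per (k, j) pair, so any gain comes from less loop overhead rather than less estimation work.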
Your problem can also be run in parallel. Take a look at foreach. A good resource is r-bloggers.com/how-to-go-parallel-in-r-basics-tips.
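One way the foreach suggestion might look, assuming the doParallel backend is installed and using two workers (the backend, the worker count, the reduced data sizes, and the seed are all assumptions, not part of the original comment). The outer loop over k is distributed; each worker fills one slice.

```r
library(sm)
library(foreach)
library(doParallel)

set.seed(1)                      # illustrative seed
y <- rep(1:6, 10)
K <- length(unique(y))
X <- matrix(rnorm(60 * 4), nrow = 60)
newx <- matrix(rnorm(3 * 4), nrow = 3)

cl <- makeCluster(2)             # assumed worker count
registerDoParallel(cl)

# Each iteration returns an nrow(newx) x ncol(X) matrix for one k
f.list <- foreach(k = 1:(K - 1), .packages = "sm") %dopar% {
  sapply(seq_len(ncol(X)), function(j)
    sm.density(X[y <= k, j],
               eval.points = newx[, j],
               display = "none")$estimate)
}

stopCluster(cl)

# Stack the per-k matrices along a third dimension
f.estimates <- array(unlist(f.list),
                     dim = c(nrow(newx), ncol(X), K - 1))
```

With only K - 1 tasks, parallelising over k caps the speedup at K - 1 workers; distributing over j instead would give finer-grained tasks at the cost of more communication.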

Tags: r performance optimization

【Solution 1】:

Setup:

library(sm)

y <- rep(1:6, 30)
K <- length(unique(y))
X <- matrix(rnorm(180 * 1000), nrow=180)
newx <- matrix(rnorm(20 * 1000), nrow=20)

f.estimates <- array(dim=c(dim(newx)[1], dim(X)[2], K - 1))
g.estimates <- array(dim=c(dim(newx)[1], dim(X)[2], K - 1))

Using plyr:

library(plyr)
cond <- expand.grid(k=1:(K-1), j=1:dim(X)[2]) # all (k, j) combinations, so one aaply call replaces the nested loops

f.estimates <- aaply(cond, 1, function(c) sm.density(X[y <= c[,1], c[,2]], 
                                                 eval.points=newx[, c[,2]], 
                                                 display="none")$estimate)
f.estimates <- aperm(f.estimates, c(3,2,1))

g.estimates <- aaply(cond, 1, function(c) sm.density(X[y > c[,1], c[,2]], 
                                                 eval.points=newx[, c[,2]], 
                                                 display="none")$estimate)
g.estimates <- aperm(g.estimates, c(3,2,1))

aperm() permutes the order of an array's dimensions, much as t() does for a matrix.
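A small base-R illustration of that point, using a made-up array (the names a, b, and m are purely illustrative): the permutation c(3, 2, 1) reverses the dimension order, and on a matrix the default aperm() agrees with t().

```r
a <- array(1:24, dim = c(2, 3, 4))
b <- aperm(a, c(3, 2, 1))        # new dim 1 is old dim 3, and so on

dim(b)                           # 4 3 2
b[4, 3, 2] == a[2, 3, 4]         # same element, indices reversed

m <- matrix(1:6, nrow = 2)
all(aperm(m) == t(m))            # default perm reverses dims, like t()
```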

【Comments】: