【Posted】: 2015-12-14 17:31:21
【Question】:
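I wrote a function with 3 nested foreach loops that run in parallel. The goal of the function is to split a list of 30 [10,5] matrices (i.e. [[30]][10,5]) into a list of 5 [10,30] matrices (i.e. [[5]][10,30]).
However, I am trying to run this function with 1,000,000 paths (i.e. foreach (m = 1:1000000)) and, unsurprisingly, the performance is terrible.
If possible, I would like to avoid the apply family of functions, because I have found that they do not work well in combination with parallel foreach loops: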
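library(foreach)
library(doParallel)

# input matr: a list of 30 [10,5] matrices
matrix_splitter <- function(matr) {
  time_horizon <- 30
  paths <- 10
  asset <- 5
  # one row per (asset, path) pair; each row collects the 30 time steps
  security_paths <- foreach(i = 1:asset, .combine = rbind, .packages = "doParallel") %dopar% {
    foreach(m = 1:paths, .combine = rbind, .packages = "doParallel") %dopar% {
      # iterate over the argument matr (the original looped over a global named daily)
      foreach(p = matr, .combine = c) %dopar% {
        p[m, i]
      }
    }
  }
  df_securities <- as.data.frame(security_paths)
  # note: sample() draws 5 random labels from 1:paths and split() recycles them,
  # so each group takes every 5th row rather than the rows of one asset
  split(df_securities, sample(rep(1:paths), asset))
}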
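As an aside on the nesting itself: the foreach package provides a `%:%` operator that merges nested loops into a single stream of tasks, instead of calling `%dopar%` inside another `%dopar%` body. Below is a minimal sketch, assuming a small random input `matr` (made up for illustration) and the sequential backend `registerDoSEQ()` so that it runs without a cluster. It uses `vapply` for the innermost step, and it is not claimed to be faster for tasks this small.

```r
library(foreach)

set.seed(1)
time_horizon <- 30; paths <- 10; asset <- 5
# hypothetical input: a list of 30 matrices of [10, 5]
matr <- replicate(time_horizon, matrix(rnorm(paths * asset), paths, asset),
                  simplify = FALSE)

registerDoSEQ()  # sequential backend; register a parallel backend here to go parallel

# %:% merges the two loops, so each (i, m) pair becomes one task
security_paths <- foreach(i = seq_len(asset), .combine = rbind) %:%
  foreach(m = seq_len(paths), .combine = rbind) %dopar% {
    # the 30 time steps for path m of asset i
    vapply(matr, function(p) p[m, i], numeric(1))
  }
security_paths <- unname(security_paths)

dim(security_paths)  # 50 rows (asset x path), 30 columns
```

Rows come out asset by asset, so row 13 is path 3 of asset 2.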
Overall, I am trying to convert this data format:
[[30]]
[,1] [,2] [,3] [,4] [,5]
[1,] 0.2800977 2.06715521 0.9196326 0.3560659 1.36126507
[2,] -0.5119867 0.24329025 0.1513218 -1.2528092 -0.04795098
[3,] -2.0293933 -1.17989270 0.3053376 -0.9528611 0.86758140
[4,] -0.6419024 -0.24846720 -0.6640066 -1.7104961 -0.32759406
[5,] -0.4340359 -0.44034013 3.3440507 0.7380613 2.01237069
[6,] -0.6679914 -0.01332117 1.9286056 -0.7194116 0.15549978
[7,] 0.5919820 0.11616685 -0.8424634 -0.7652715 1.34176688
[8,] 0.8079152 0.40592119 -0.4291811 0.9358829 -0.97479314
[9,] -0.0265207 -0.03598320 1.1287344 0.4732984 1.37792596
[10,] 1.0553966 0.65776721 -1.2833613 -0.2414846 0.81528686
into this format (continuing up to V30, of course):
$`5`
V1 V2 V3 V4 V5 V6 V7
result.2 -0.11822260 1.7712833 1.97737285 -1.6643193 0.4788075 1.2394064 1.4800787
result.7 -1.23251178 0.4267885 -0.07728632 0.3463092 0.8766395 0.6324840 0.5946710
result.2.1 -1.27309457 -0.3128173 -0.79561297 -0.4713307 -0.4344864 0.4688124 -0.5646857
result.7.1 0.51702719 -1.6242650 -2.37976199 -0.1088408 0.4846507 -0.7594376 0.9326529
result.2.2 1.77550390 0.9279155 0.26168402 0.4893835 1.4131326 0.5989508 -0.3434010
result.7.2 -0.01590682 -0.5568578 1.35789122 -0.1385092 -0.4501515 -0.2581724 0.5451699
result.2.3 0.30400225 -1.0245640 -0.05285694 -0.1354228 0.3070331 -0.7618850 1.0330961
result.7.3 -0.08139912 0.4106541 1.40418839 0.2471505 1.2106539 1.3844721 0.4006751
result.2.4 0.94977544 -0.8045054 1.48791211 1.4361686 -0.3789274 -1.9570125 -1.6576634
result.7.4 0.70449194 1.6887800 0.56447340 0.6465640 2.6865388 -0.7367524 0.6242624
V8 V9 V10 V11 V12 V13
result.2 -0.432404728 -1.6225350 0.09855465 0.17371907 0.3081843 0.15148452
result.7 -0.597420706 0.6173004 0.07518596 2.01741406 0.1767152 -0.39219471
result.2.1 0.918408322 -1.6896424 -0.13409626 0.38674224 0.3491750 -1.61083286
result.7.1 2.564057340 -0.7696399 1.06103614 1.38528367 1.1684045 -0.08467871
result.2.2 0.951995816 0.1910284 1.79943500 2.13909498 0.2847664 0.31094568
result.7.2 -0.479349220 -0.2368760 0.04298525 -0.40385960 0.3986555 -1.93499213
result.2.3 -1.382370069 1.0459845 -0.33106323 -0.43362925 0.7045572 -0.30211601
result.7.3 -1.457106442 0.1487447 -2.52392942 -0.02399523 -1.0349746 0.87666365
result.2.4 -0.848879365 0.7521024 0.16790915 0.47112444 0.8886361 -0.12733039
result.7.4 -0.003350467 0.4021858 -1.80031445 -1.42399232 1.0507765 -0.36193846
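For comparison, the same reshaping can be sketched sequentially in base R. This is only an illustration: it relies on `sapply`/`lapply`, which the question would prefer to avoid, and the input `matr` is random data made up for the example. Converting each result with `as.data.frame` yields `V1` to `V30` column names like those shown above.

```r
set.seed(42)
time_horizon <- 30; paths <- 10; asset <- 5
# hypothetical input: a list of 30 matrices of [10, 5]
matr <- replicate(time_horizon, matrix(rnorm(paths * asset), paths, asset),
                  simplify = FALSE)

# for asset i, take column i of every time-step matrix and bind them side by side
by_asset <- lapply(seq_len(asset), function(i) {
  as.data.frame(sapply(matr, function(p) p[, i]))
})
names(by_asset) <- seq_len(asset)

str(by_asset[["5"]][, 1:3])
```

Here each list element holds all paths of one asset, with one column per time step.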
【Comments】:
-
How do you want the data rearranged? In your example, the values in the output do not appear anywhere in the input.
-
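It really is just going from [[30]][10,5] to [[5]][10,30].
-
I don't find the explanation very clear at all, but I suspect you may find the package (and function) abind helpful, followed by the function @987654329 @.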
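One way to read the abind suggestion is sketched below. The second function name is not shown in the comment, so using base R's `aperm` here is an assumption, as is the random input `matr`. If abind is not installed, the sketch falls back to base R's `simplify2array`, which stacks equally sized matrices in the same way.

```r
set.seed(7)
# hypothetical input: a list of 30 matrices of [10, 5]
matr <- replicate(30, matrix(rnorm(10 * 5), 10, 5), simplify = FALSE)

# stack the 30 [10, 5] matrices into one [10, 5, 30] array
arr <- if (requireNamespace("abind", quietly = TRUE)) {
  unname(abind::abind(matr, along = 3))
} else {
  simplify2array(matr)  # base-R fallback when abind is not installed
}

# swap the last two dimensions: [path, asset, time] -> [path, time, asset]
arr <- aperm(arr, c(1, 3, 2))

# slice along the asset dimension to get 5 matrices of [10, 30]
by_asset <- lapply(seq_len(dim(arr)[3]), function(i) arr[, , i])
```

Keeping the data in a single 3-D array avoids per-element looping; the final `lapply` is only needed if a list is required downstream.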
-
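Is the performance suffering from the parallel overhead? You are doing essentially no work per task, yet dispatching it in parallel. Apart from that, do I understand correctly that you want to change [[30]][1000000,5] into [[5]][1000000,30]?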
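To illustrate the overhead point: each innermost task in the nested loops extracts a single number, so scheduling cost likely dominates. Below is a sketch of a non-parallel alternative for the larger shape, assuming random input and a smaller hypothetical `n_paths` so that it runs quickly. At 1,000,000 paths the data alone is 1,000,000 x 5 x 30 doubles, roughly 1.2 GB, so memory matters as well.

```r
set.seed(3)
n_paths <- 10000; asset <- 5; time_horizon <- 30  # n_paths is a stand-in for 1e6
# hypothetical input: a list of 30 matrices of [n_paths, 5]
matr <- replicate(time_horizon, matrix(rnorm(n_paths * asset), n_paths, asset),
                  simplify = FALSE)

elapsed <- system.time({
  # bind all time steps side by side: columns run t1a1..t1a5, t2a1..t2a5, ...
  wide <- do.call(cbind, matr)
  # asset i sits in columns i, i + 5, i + 10, ...
  by_asset <- lapply(seq_len(asset), function(i) {
    wide[, seq(i, by = asset, length.out = time_horizon), drop = FALSE]
  })
})
elapsed[["elapsed"]]  # timing depends on the machine
```

The whole reshape is one `cbind` plus five column selections, so there is little left to parallelise.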
Tags: r matrix foreach parallel-processing dataframe