【Question title】: Parallel Processing for my for loop function in R
【Posted】: 2020-12-25 17:13:14
【Question】:

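I have this for loop in R. For each seed passed to set.seed, it simulates a series, fits it with auto.arima, and searches, among the seeds whose selected order is exactly (1, 0, 0), for an ARIMA(1, 0, 0) coefficient of exactly 0.950.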

library(forecast)
library(dplyr)
arima_order_results = data.frame()

seed_out1 <- c(14,16,20,29,50,51,53,55,56,59,64,71,77,95,98,106,110,115,120,126,174,175,187,214,216,256,257,265,266,268,283,286,293,301,309,311,318,320,346,349,356,363,374,376,378,379,396,397,416,422,427,445,446,448,452,453,458,466,469,470,471,475,480,501,505,506,524,539,540,559,564,566,567,573,574,579,589,593,625,626,634,640,643,647,674,676,678,679,680,687,688,689,704,711,712,727,738,742,746,747,781,783,784,814,816,832,847,859,860,865,880,894,898,902,906,918,920,924,926,929,936,939,941,945,949,960,961,975)

for (my_seed in seed_out1){
  set.seed(my_seed)
  ar1 <- arima.sim(n = 15, model=list(ar = 0.95, order = c(1, 0, 0)), sd = 1)
  ar2 <- auto.arima(ar1, ic = "aicc")
  arr <- as.data.frame(t(ar2$coef))
  if(substr(as.character(arr[1]), 1, 5) == "0.950") {
    arr <- cbind(data.frame(seed=my_seed),arr)
    print(arr)
    arima_order_results = bind_rows(arima_order_results,arr)

  }
}

The R code runs fine, but I would like it to run in parallel.

I am working on Windows.
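
For reference, the match test in the loop above compares text, not numbers: it keeps a coefficient when its printed form starts with "0.950", which corresponds to truncating to three decimals. The sketch below puts that test next to a numeric band check; `starts_with_0950` and `in_band` are hypothetical names introduced here for illustration and are not part of the original code.

```r
# Text test used in the loop: TRUE when the printed value starts with "0.950"
starts_with_0950 <- function(x) substr(as.character(x), 1, 5) == "0.950"

# Hypothetical numeric alternative: TRUE when lower <= x < lower + width
in_band <- function(x, lower = 0.950, width = 0.001) {
  x >= lower & x < lower + width
}

x <- c(0.9504321, 0.9512, 0.95, 0.6318504)
data.frame(x, text = starts_with_0950(x), numeric = in_band(x))
```

The two tests agree except at the edge: a coefficient of exactly 0.95 prints as "0.95", which is only four characters, so the text test misses it while the numeric band accepts it.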

【Comments】:

    Tags: r search parallel-processing arima


    【Answer 1】:

    One option is to use parallel together with foreach.

    library(parallel)    
    library(foreach)
    
    library(forecast)
    library(dplyr)
    library(doSNOW)
    
    n <-  parallel::detectCores()
    cl <- parallel::makeCluster(n, type = "SOCK")   
    doSNOW::registerDoSNOW(cl)
    
    seed_out1 <- c(14,16,20,29,50,51,53,55,56,59,64,71,77,95,98,106,110,115,120,126,174,175,187,214,216,256,257,265,266,268,283,286,293,301,309,311,318,320,346,349,356,363,374,376,378,379,396,397,416,422,427,445,446,448,452,453,458,466,469,470,471,475,480,501,505,506,524,539,540,559,564,566,567,573,574,579,589,593,625,626,634,640,643,647,674,676,678,679,680,687,688,689,704,711,712,727,738,742,746,747,781,783,784,814,816,832,847,859,860,865,880,894,898,902,906,918,920,924,926,929,936,939,941,945,949,960,961,975)
    
    
    # Fit each seed on a worker; dplyr and forecast are loaded on every worker
    lst_out <- foreach::foreach(i = seq_along(seed_out1),
                                .packages = c("dplyr", "forecast")) %dopar% {
      my_seed <- seed_out1[i]
      set.seed(my_seed)
      ar1 <- arima.sim(n = 15, model = list(ar = 0.95, order = c(1, 0, 0)), sd = 1)
      ar2 <- auto.arima(ar1, ic = "aicc")
    
      arr <- as.data.frame(t(ar2$coef))
      # Attach the seed only when the first coefficient starts with "0.631"
      if (substr(as.character(arr[1]), 1, 5) == "0.631") {
        arr <- cbind(data.frame(seed = my_seed), arr)
      }
    
      return(arr)
    }
    parallel::stopCluster(cl)
    

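    Then bind the list elements and remove the rows that contain NA

    library(purrr)  # provides keep()
    
    out <- lst_out %>%
                keep(~ length(.) > 0) %>%
                bind_rows
    
    # Rows without a matching coefficient have no seed, so they are dropped here
    out <- out %>%
               filter(complete.cases(.))
    
    out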
    #        ar1 intercept seed
    #1 0.6318504   3.30278  427
    

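    【Discussion】:

    • It does not search out the ar2 coefficient 0.9500; instead it prints the coefficients of each of the 128 series along with its seed.
    • @DanielJames That is right, but in the output there is no case of 0.950, so that column was not created. bind_rows automatically appends a new column if there is one.
    • @DanielJames. You can change the value to 0.631, i.e. if(substr(as.character(arr[1]), 1, 5) == "0.631"), and then there will be 3 columns.
    • Sorry, it is 0.950. Even so, I want it to print only the seed numbers whose coefficient is 0.950 and to exclude the rest of the 128 seeds. If 0.950 does not exist, it should print nothing.
    • @DanielJames Can you check my update? I replaced 0.950 with 0.631 because there are no cases for 0.950.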
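
The behaviour requested in the comments above (return only the matching seeds, and nothing when none match) can be sketched by having each worker return NULL for a non-matching seed and dropping the NULLs afterwards. This is a minimal sketch under stated assumptions, not the answer's method: `fit_one_seed` and its `prefix` argument are hypothetical names, only the first five seeds are used, and base `stats::arima` with a fixed (1, 0, 0) order stands in for `forecast::auto.arima` so that the example needs nothing beyond base R.

```r
library(parallel)

# Hypothetical helper: fit one seed and return a one-row data frame only when
# the first coefficient, printed as text, starts with `prefix`; otherwise NULL.
fit_one_seed <- function(my_seed, prefix = "0.950") {
  set.seed(my_seed)
  ar1 <- arima.sim(n = 15, model = list(ar = 0.95, order = c(1, 0, 0)), sd = 1)
  # Assumption: fixed-order stats::arima replaces forecast::auto.arima here.
  # A fit that fails on such a short series is treated as "no match".
  fit <- tryCatch(arima(ar1, order = c(1, 0, 0)), error = function(e) NULL)
  if (is.null(fit)) return(NULL)
  coefs <- as.data.frame(t(fit$coef))
  if (substr(as.character(coefs[[1]]), 1, 5) != prefix) return(NULL)
  cbind(data.frame(seed = my_seed), coefs)
}

seeds <- c(14, 16, 20, 29, 50)

# makeCluster() creates a socket (PSOCK) cluster by default, which works on Windows
cl <- makeCluster(2)
res <- tryCatch(parLapply(cl, seeds, fit_one_seed),
                finally = stopCluster(cl))

# Drop the NULLs; `matches` is NULL when no seed matched
matches <- do.call(rbind, Filter(Negate(is.null), res))
matches
```

To mirror the original more closely, the worker could call forecast::auto.arima(ar1, ic = "aicc") instead, after loading the package on every worker with clusterEvalQ(cl, library(forecast)). Because each task calls set.seed itself, the parallel results should match a sequential lapply over the same seeds.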