【Posted】: 2015-03-21 19:09:43
【Question】:
This is basically a follow-up to a question I asked earlier. The link is:
http://stackoverflow.com/questions/28115272/how-can-i-accomplish-parallel-processing-in-r
Now, the code is:
library( doParallel )
cl <- makeCluster( 2 ) # for 2 processors, i.e. 2 parallel chains
registerDoParallel( cl )
datalist <- list(mydataset1 , mydataset2)
# now start the chains
nchains <- 2 # for two processors
results_list <- foreach(i = 1:nchains,
                        .packages = c('packages_you_need')) %dopar% {
  result <- find.string(datalist[[i]])
  return(result)
}
This seems to work fine when datalist contains 2 simple strings, for example:
datalist <- list("abcabcabc","adcadcadc")
However, it fails when I combine two actual datasets, each containing many lines of strings, for example:
Dataset1:
abcabcabc
adcadcadc
aecaecaec
afcafcafc
.........
Dataset2:
xyzxyzxyz
xzcxzcxzc
xtcxtcxtc
xdcxdcxdc
.........
With datasets like these, the code produces an error:
Error in { : task 1 failed - "'to' must be of length 1"
Any suggestions as to why this happens or how to get rid of it?
Thanks!
Edit:
str(datalist) - List of 2
$ : chr [1:3631] "000000000fbff000ff0000f00000" "000000000000fffffffffff0f000" "bb0bb00000f000000000bfff0000" "00b0b000bfbffffbffbf0ff00000" ...
$ : chr [1:3631] "000000000srst000tt0000t00000" "000000000000ttttttttttt0r000" "ss0tt00000q000000000sstt0000" "00s0q000ssqtsstrstss0ss00000" ...
dput(head(datalist))
"00000000r0t0st0000p000000000", "00000ssssttstssttts000000000",
"000000000r00sq000tp000000000", "0000000000tsq0sq0qt000000000",
"000q0000r00000000rss00000000", "00000000ttttttttttt000000000",
"0000000000s0qs000s0000000000", "000000ppqppqsrrrsr0000000000",
"00000r00s0t00ss00st000000000", "0000000000s000s0tt0000000000",
"00000s0000ttstq000t000000000", "0000000000qrs0t0s00t00000000",
"000000000s000stt0t0000000000", "0000000000qtr0000t0000000000",
"0000000000rrsrsqrr0000000000", "0000000000tsp0s000s000000000",
..............................................................
Edit 2: an example with 4 elements in each dataset.
str(datalist)
List of 2
$ : chr [1:4] "000000000fbff000ff0000f00000" "000000000000fffffffffff0f000" "bb0bb00000f000000000bfff0000" "00b0b000bfbffffbffbf0ff00000"
$ : chr [1:4] "000000000srst000tt0000t00000" "000000000000ttttttttttt0r000" "ss0tt00000q000000000sstt0000" "00s0q000ssqtsstrstss0ss00000"
dput(head(datalist))
list(c("000000000fbff000ff0000f00000", "000000000000fffffffffff0f000",
"bb0bb00000f000000000bfff0000", "00b0b000bfbffffbffbf0ff00000"
), c("000000000srst000tt0000t00000", "000000000000ttttttttttt0r000",
"ss0tt00000q000000000sstt0000", "00s0q000ssqtsstrstss0ss00000"
))
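A note on the likely cause, as a minimal sketch under stated assumptions. The definition of find.string() is not shown here, so the version below is a hypothetical stand-in that assumes the function expects a single string and calls seq() on its length. Under that assumption, passing a whole character vector (one dataset) makes the `to` argument of seq() longer than 1, which is consistent with the error reported above. One possible workaround is to keep one parallel task per dataset and apply the function to each string inside the task:

```r
library(doParallel)  # also attaches foreach and parallel

# Hypothetical stand-in: the real find.string() is not shown in the question.
# It is assumed to expect ONE string and to call seq() on that string's length.
find.string <- function(string) {
  positions <- seq(1, nchar(string))  # errors when `string` has length > 1
  length(positions)
}

# Two small datasets taken from the Edit 2 example above.
datalist <- list(
  c("000000000fbff000ff0000f00000", "000000000000fffffffffff0f000"),
  c("000000000srst000tt0000t00000", "000000000000ttttttttttt0r000")
)

# Passing a whole character vector triggers the failure; capture its message.
err <- tryCatch(find.string(datalist[[1]]),
                error = function(e) conditionMessage(e))

cl <- makeCluster(2)
registerDoParallel(cl)

# One task per dataset; inside each task, call find.string() once per string.
results_list <- foreach(i = seq_along(datalist)) %dopar% {
  lapply(datalist[[i]], find.string)
}

stopCluster(cl)
```

If the real find.string() is already vectorized in some places but not others, the inner lapply() may still be the simplest fix, since it guarantees the function only ever sees one string at a time.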
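【Comments】:
- Can you edit your question and paste in the results of str(datalist) and dput(head(datalist))? That will make troubleshooting much easier.
- I have done that. :)
- Don't truncate the dput output. I'm going to copy it and test with it, so I need the whole thing.
- I wouldn't have truncated it, but it contains about 3631 lines, which is hard to attach here.
- dput(head(datalist)) will only give a few lines of each part; that is what head does. However, if you mean there are 3631 list elements, subset it to 3-4 elements and then run dput(head(...)). Thanks. Update: you don't need to subset, I see that the elements themselves are 3631 long.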
Tags: r multithreading parallel-processing