[Posted]: 2018-09-07 19:18:10
[Question]:
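I'm trying to fit a finite mixture model in which the model for each class is a neural network. Being able to parallelize would be very useful, because keras doesn't max out all the available cores on my laptop, let alone on a large cluster.
But when I try to set different learning rates for different models inside a parallel foreach loop, the whole thing chokes.
What is going on? I suspect it has something to do with scope: perhaps the workers aren't running on separate instances of tensorflow. But I really don't know. How can I make this work? And what do I need to understand in order to know why it doesn't work?
Here is an MWE. Set the foreach loop to %do% and it works fine. Set it to %dopar% and it chokes at the fitting stage.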
library(foreach)
library(doParallel)
registerDoParallel(2)
library(keras)
library(tensorflow)
mnist <- dataset_mnist()
x_train <- mnist$train$x
y_train <- mnist$train$y
x_test <- mnist$test$x
y_test <- mnist$test$y
x_train <- array_reshape(x_train, c(nrow(x_train), 784))
x_test <- array_reshape(x_test, c(nrow(x_test), 784))
# rescale
x_train <- x_train / 255
x_test <- x_test / 255
y_train <- to_categorical(y_train, 10)
y_test <- to_categorical(y_test, 10)
# make tensorflow run single-threaded
session_conf <- tf$ConfigProto(intra_op_parallelism_threads = 1L,
                               inter_op_parallelism_threads = 1L)
# Create the session using the custom configuration
sess <- tf$Session(config = session_conf)
K <- backend()
K$set_session(sess)
models <- foreach(i = 1:2) %dopar% {
  model <- keras_model_sequential()
  model %>%
    layer_dense(units = 256/i, activation = 'relu', input_shape = c(784)) %>%
    layer_dropout(rate = 0.4) %>%
    layer_dense(units = 128/i, activation = 'relu') %>%
    layer_dropout(rate = 0.3) %>%
    layer_dense(units = 10, activation = 'softmax')
  print("A")
  model %>% compile(
    loss = 'categorical_crossentropy',
    optimizer = optimizer_rmsprop(),
    metrics = c('accuracy')
  )
  print("B")
  history <- model %>% fit(
    x_train, y_train,
    epochs = 3, batch_size = 128,
    validation_split = 0.2, verbose = 0
  )
  print("done")
}
Here is sessionInfo():
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] splines parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] panelNNET_1.0 matrixStats_0.54.0 MASS_7.3-50 lfe_2.8-2 tensorflow_1.9 keras_2.1.6.9005
[7] mgcv_1.8-24 nlme_3.1-137 scales_1.0.0 forcats_0.3.0 stringr_1.3.1 purrr_0.2.5
[13] readr_1.1.1 tidyr_0.8.1 tibble_1.4.2 tidyverse_1.2.1 maptools_0.9-3 rgeos_0.3-28
[19] rgdal_1.3-4 sp_1.3-1 broom_0.5.0 ggplot2_3.0.0 randomForest_4.6-14 dplyr_0.7.6
[25] glmnet_2.0-16 Matrix_1.2-14 doBy_4.6-2 doParallel_1.0.11 iterators_1.0.10 foreach_1.4.4
loaded via a namespace (and not attached):
[1] httr_1.3.1 jsonlite_1.5 modelr_0.1.2 Formula_1.2-3 assertthat_0.2.0 cellranger_1.1.0
[7] yaml_2.2.0 pillar_1.3.0 backports_1.1.2 lattice_0.20-35 glue_1.3.0 reticulate_1.10
[13] digest_0.6.15 RcppEigen_0.3.3.4.0 rvest_0.3.2 colorspace_1.3-2 sandwich_2.5-0 plyr_1.8.4
[19] pkgconfig_2.0.1 haven_1.1.2 xtable_1.8-2 whisker_0.3-2 withr_2.1.2 lazyeval_0.2.1
[25] cli_1.0.0 magrittr_1.5 crayon_1.3.4 readxl_1.1.0 xml2_1.2.0 foreign_0.8-70
[31] tools_3.5.1 hms_0.4.2 munsell_0.5.0 bindrcpp_0.2.2 compiler_3.5.1 rlang_0.2.2
[37] grid_3.5.1 rstudioapi_0.7 base64enc_0.1-3 labeling_0.3 gtable_0.2.0 codetools_0.2-15
[43] R6_2.2.2 tfruns_1.3 zoo_1.8-3 lubridate_1.7.4 zeallot_0.1.0 bindr_0.1.1
[49] stringi_1.2.4 Rcpp_0.12.18 tidyselect_0.2.4
[Discussion]:
-
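FYI, specifying the operating system is essential, ideally with the full sessionInfo() output. Specifically, doParallel::registerDoParallel(2) creates a different type of cluster depending on the OS.
-
Yes, this is Linux, thanks. Posting the session info.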
-
I'm not an R expert. Maybe you should create a tf$Session for each parallel run? Otherwise there might be conflicts between the runs.
-
@DanielGL Good idea, worth a try. But can you run single-threaded tensorflow sessions in parallel in python?
-
@DanielGL Your idea worked, and in retrospect it should have been obvious. Write up a short answer if you want some points.
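For readers who want to see the suggestion spelled out, here is a minimal sketch of creating one tf$Session per parallel run. It is not the asker's final code, which was not posted. It assumes the TensorFlow 1.x API listed in the sessionInfo() above (tf$Session and tf$ConfigProto are not available under those names in TensorFlow 2.x) and the forked workers that registerDoParallel(2) sets up on Linux. The name `results` and the choice to return metrics and weights are illustrative assumptions.

```r
library(foreach)
library(doParallel)
library(keras)
library(tensorflow)

registerDoParallel(2)  # forked workers on Linux, as in the question

# Same data preparation as in the question, training split only
mnist <- dataset_mnist()
x_train <- array_reshape(mnist$train$x, c(nrow(mnist$train$x), 784)) / 255
y_train <- to_categorical(mnist$train$y, 10)

# `results` is a hypothetical name; each element holds plain R data
results <- foreach(i = 1:2) %dopar% {
  # Create the session inside the worker, so each parallel run has its own
  # single-threaded TensorFlow session instead of sharing the parent's
  session_conf <- tf$ConfigProto(intra_op_parallelism_threads = 1L,
                                 inter_op_parallelism_threads = 1L)
  sess <- tf$Session(config = session_conf)
  backend()$set_session(sess)

  model <- keras_model_sequential()
  model %>%
    layer_dense(units = 256 / i, activation = 'relu', input_shape = c(784)) %>%
    layer_dropout(rate = 0.4) %>%
    layer_dense(units = 128 / i, activation = 'relu') %>%
    layer_dropout(rate = 0.3) %>%
    layer_dense(units = 10, activation = 'softmax')

  model %>% compile(
    loss = 'categorical_crossentropy',
    optimizer = optimizer_rmsprop(),
    metrics = c('accuracy')
  )

  history <- model %>% fit(
    x_train, y_train,
    epochs = 3, batch_size = 128,
    validation_split = 0.2, verbose = 0
  )

  # Return ordinary R objects: a Keras model is a reference to a Python
  # object and does not survive the trip back to the parent process
  list(metrics = history$metrics, weights = get_weights(model))
}
```

The only change from the MWE that matters is that the session setup moved from the top of the script into the loop body. Returning the metrics and weights rather than the model itself is a design choice of this sketch. On a socket cluster (for example on Windows) the loop would probably also need `.packages = c("keras", "tensorflow")` so that each worker loads the libraries.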
Tags: r tensorflow parallel-processing scope keras