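【Posted】: 2019-04-15 14:27:25
【Question】:
I am trying to connect to Spark from RStudio. We currently use the Cloudera Hadoop distribution running Spark 2.2. I tested everything from the edge node: there I can create a Spark context and run my queries. Until yesterday everything also worked from RStudio, but the connection from RStudio has suddenly started failing.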
library(dplyr)
library(sparklyr)
config <- spark_config()
config$spark.driver.memory <- "8G"
config$spark.executor.memory <- "8G"
config$spark.executor.executor <- "2"  # not a standard Spark property; spark.executor.instances may have been intended
config$spark.executor.cores <- "4"
config$spark.kryoserializer.buffer.max <- "2000m"
config$spark.driver.maxResultSize <- "4G"
config$spark.akka.frameSize <- "768"  # Akka-based setting from Spark 1.x; likely has no effect on Spark 2.2
sc <- spark_connect(master = "yarn-client",
                    version = "2.2.0",
                    config = config,
                    spark_home = "/opt/cloudera/parcels/SPARK2-2.2.0.cloudera1-1.cdh5.12.0.p0.142354/lib/spark2")
Error in force(code) :
Failed while connecting to sparklyr to port (8880) for sessionid (14727): Sparklyr gateway did not respond while retrieving ports information after 60 seconds
Path: /opt/cloudera/parcels/SPARK2-2.2.0.cloudera1-1.cdh5.12.0.p0.142354/lib/spark2/bin/spark-submit
Parameters: --class, sparklyr.Shell, '/usr/lib64/R/library/sparklyr/java/sparklyr-2.2-2.11.jar', 8880, 14727
Log: /tmp/RtmpoNJQEH/file151b437c0313b_spark.log
---- Output Log ----
18/11/12 13:54:50 INFO sparklyr: Session (14727) is starting under 127.0.0.1 port 8880
18/11/12 13:54:50 INFO sparklyr: Session (14727) found port 8880 is not available
18/11/12 13:54:50 INFO sparklyr: Backend (14727) found port 8884 is available
18/11/12 13:54:50 INFO sparklyr: Backend (14727) is registering session in gateway
18/11/12 13:54:50 INFO sparklyr: Backend (14727) is waiting for registration in gateway
---- Error Log ----
I also checked the sparklyr version; it is 0.9.2.
Can anyone tell me what the problem might be?
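The output log shows the session finding port 8880 unavailable and then waiting on gateway registration, which may mean an earlier gateway process is still holding that port. This is only one possible reading of the log. A minimal sketch for checking it from R follows; `port_in_use` and `first_free_port` are hypothetical helper names written for this illustration and are not part of sparklyr. The commented-out lines at the end assume the sparklyr settings `sparklyr.gateway.port` and `sparklyr.gateway.start.timeout` behave as their names suggest.

```r
# Return TRUE if something is already listening on host:port.
# Hypothetical helper (not part of sparklyr); uses only base R.
port_in_use <- function(port, host = "127.0.0.1", timeout = 2) {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(host = host, port = port, open = "r+",
                       blocking = TRUE, timeout = timeout)
    ),
    error = function(e) NULL  # connection refused: nothing is listening
  )
  if (is.null(con)) return(FALSE)
  close(con)
  TRUE
}

# Return the first candidate port with no listener, or NA if none is free.
# Hypothetical helper (not part of sparklyr).
first_free_port <- function(candidates) {
  for (p in candidates) {
    if (!port_in_use(p)) return(p)
  }
  NA_integer_
}

# Is the default gateway port from the log already taken?
print(port_in_use(8880))

# Possible use with the config object from the question (assumption, untested
# on this cluster):
# config$sparklyr.gateway.port <- first_free_port(8880:8890)
# config$sparklyr.gateway.start.timeout <- 120
```

If `port_in_use(8880)` returns `TRUE` while no Spark session is supposed to be running, it may be worth checking the RStudio Server host for a leftover `sparklyr.Shell` process before reconnecting.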
【Discussion】:
Tags: r apache-spark rstudio sparklyr rstudio-server