【问题标题】:Spark application cannot run successfully on EMR with YARNSpark 应用程序无法在带有 YARN 的 EMR 上成功运行
【发布时间】:2019-02-14 16:49:40
【问题描述】:

我的 spark 应用程序在客户端模式下完美运行,主控 local[*] 在 EMR 和 yarn 模式下也在本地运行

Spark 提交命令:

spark-submit --deploy-mode cluster --master yarn \
    --num-executors 3 --executor-cores 1 --executor-memory 2G \
    --conf spark.driver.memory=4G --class my.APP \
    --packages org.apache.spark:spark-core_2.11:2.3.1,org.apache.spark:spark-sql_2.11:2.3.1,org.elasticsearch:elasticsearch-spark-20_2.11:6.2.3,org.apache.spark:spark-mllib_2.11:2.3.1,org.postgresql:postgresql:42.2.4,mysql:mysql-connector-java:8.0.12,org.json4s:json4s-jackson_2.11:3.6.1,org.scalaj:scalaj-http_2.11:2.4.0,org.apache.commons:commons-math3:3.6.1 s3://spark-akshdiu/spark-sandbox_2.11-0.1.jar

它试图运行的行:val sc = new SparkContext(conf)

我也尝试过SparkContext.getOrCreate(conf),但失败了。

这里是个例外:

18/09/10 09:15:06 INFO Client:
     client token: N/A
     diagnostics: User class threw exception: java.lang.IllegalStateException: Promise already completed.
    at scala.concurrent.Promise$class.complete(Promise.scala:55)
    at scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:153)
    at scala.concurrent.Promise$class.success(Promise.scala:86)
    at scala.concurrent.impl.Promise$DefaultPromise.success(Promise.scala:153)
    at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$sparkContextInitialized(ApplicationMaster.scala:423)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.sparkContextInitialized(ApplicationMaster.scala:843)
    at org.apache.spark.scheduler.cluster.YarnClusterScheduler.postStartHook(YarnClusterScheduler.scala:32)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:559)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2493)
    at my.APP$.main(APP.scala:279)
    at my.APP.main(APP.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$4.run(ApplicationMaster.scala:721)

     ApplicationMaster host: 10.0.104.106
     ApplicationMaster RPC port: 0
     queue: default
     start time: 1536570864212
     final status: FAILED
     tracking URL: http://ip-10-0-104-106.us-west-2.compute.internal:20888/proxy/application_1536569833967_0006/
     user: hadoop
Exception in thread "main" org.apache.spark.SparkException: Application application_1536569833967_0006 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1165)
    at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1520)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
18/09/10 09:15:06 INFO ShutdownHookManager: Shutdown hook called
18/09/10 09:15:06 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-896ebe22-bf2d-41ba-b89c-8c3ba9e7cbd0
18/09/10 09:15:06 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-c99239b0-880f-49ff-9fb0-b848422ff4fe

我在一个主节点上运行它,或者在 m5.xlarge 中运行 1 个主节点 + 2 个从节点,但都失败了。

【问题讨论】:

  • 您是否设法解决了这个问题?我也有同样的问题......
  • 我想我已经解决了,但我忘记了我的解决方案。而且我已经很久没有使用 Spark了。

标签: apache-spark hadoop-yarn amazon-emr


【解决方案1】:

尝试通过替换来运行应用程序

 --deploy-mode client

【讨论】:

    猜你喜欢
    • 1970-01-01
    • 2021-09-28
    • 1970-01-01
    • 1970-01-01
    • 2018-09-18
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    相关资源
    最近更新 更多