【Question title】: Why did the Worker kill the executor?
【Posted】: 2017-02-21 14:02:39
【Question description】:

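I am writing a Spark application that runs on a Spark standalone cluster. When I run the following code, I get a ClassNotFoundException (see the screenshot). So I looked at the log of the worker (192.168.111.202).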

package main

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object mavenTest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("stream test").setMaster("spark://192.168.111.201:7077")
    val sc = new SparkContext(conf)
    val input = sc.textFile("file:///root/test")

    val words = input.flatMap { line => line.split(" ") }

    val counts = words.map(word => (word, 1)).reduceByKey { case (x, y) => x + y }

    counts.saveAsTextFile("file:///root/mapreduce")
  }
}

Below is the worker's log. It shows the worker killing the executor, followed by an error. Why did the Worker kill the executor? Can you give me any clues?

16/03/24 20:16:48 INFO Worker: Asked to launch executor app-20160324201648-0011/0 for stream test
16/03/24 20:16:48 INFO SecurityManager: Changing view acls to: root
16/03/24 20:16:48 INFO SecurityManager: Changing modify acls to: root
16/03/24 20:16:48 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/24 20:16:48 INFO ExecutorRunner: Launch command: "/usr/java/jdk1.8.0_73/jre/bin/java" "-cp" "/opt/spark-1.5.2-bin-hadoop2.6/sbin/../conf/:/opt/spark-1.5.2-bin-hadoop2.6/lib/spark-assembly-1.5.2-hadoop2.6.0.jar:/opt/spark-1.5.2-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/opt/spark-1.5.2-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark-1.5.2-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/etc/hadoop" "-Xms1024M" "-Xmx1024M" "-Dspark.driver.port=40243" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "akka.tcp://sparkDriver@192.168.111.201:40243/user/CoarseGrainedScheduler" "--executor-id" "0" "--hostname" "192.168.111.202" "--cores" "1" "--app-id" "app-20160324201648-0011" "--worker-url" "akka.tcp://sparkWorker@192.168.111.202:53363/user/Worker"
16/03/24 20:16:54 INFO Worker: Asked to kill executor app-20160324201648-0011/0
16/03/24 20:16:54 INFO ExecutorRunner: Runner thread for executor app-20160324201648-0011/0 interrupted
16/03/24 20:16:54 INFO ExecutorRunner: Killing process!
16/03/24 20:16:54 ERROR FileAppender: Error writing stream to file /opt/spark-1.5.2-bin-hadoop2.6/work/app-20160324201648-0011/0/stderr
java.io.IOException: Stream closed
        at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:170)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:283)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
        at java.io.FilterInputStream.read(FilterInputStream.java:107)
        at org.apache.spark.util.logging.FileAppender.appendStreamToFile(FileAppender.scala:70)
        at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply$mcV$sp(FileAppender.scala:39)
        at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
        at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
        at org.apache.spark.util.logging.FileAppender$$anon$1.run(FileAppender.scala:38)
16/03/24 20:16:54 INFO Worker: Executor app-20160324201648-0011/0 finished with state KILLED exitStatus 143
16/03/24 20:16:54 INFO Worker: Cleaning up local directories for application app-20160324201648-0011
16/03/24 20:16:54 INFO ExternalShuffleBlockResolver: Application app-20160324201648-0011 removed, cleanupLocalDirs = true

【Question discussion】:

  • What is the full line of the ClassNotFoundException?
  • It is "16/03/25 03:03:32 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 192.168.111.202): java.lang.ClassNotFoundException: main.mapreduce$$anonfun$2"
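
Editorial sketch, not part of the original thread: the exception quoted in the comment above names an anonymous function class (`main.mapreduce$$anonfun$2`). One common cause of this is that the executors cannot load the application's compiled classes, for example when the driver is started from an IDE and the application jar is never shipped to the cluster. Assuming that is what happened here, the following shows one way to pass the jar explicitly through `SparkConf.setJars`. The jar path and the object name `MavenTestWithJar` are hypothetical; the master URL and file paths are copied from the question.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object MavenTestWithJar {
  def main(args: Array[String]): Unit = {
    // Hypothetical path: wherever your build (e.g. `mvn package`)
    // writes the application jar.
    val appJar = "target/maven-test-1.0.jar"

    val conf = new SparkConf()
      .setAppName("stream test")
      .setMaster("spark://192.168.111.201:7077")
      // Distribute the application jar so executors can load the
      // classes compiled from the closures below.
      .setJars(Seq(appJar))

    val sc = new SparkContext(conf)
    try {
      val counts = sc.textFile("file:///root/test")
        .flatMap(_.split(" "))
        .map(word => (word, 1))
        .reduceByKey(_ + _)
      counts.saveAsTextFile("file:///root/mapreduce")
    } finally {
      // Stop the context so the application releases its executors.
      sc.stop()
    }
  }
}
```

Submitting the packaged jar with `spark-submit` should have the same effect, since it also makes the application jar available to the executors. If the exception persists after the jar is shipped, the cause is likely something else.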

Tags: eclipse scala apache-spark


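【Solution 1】:

I found that this was a memory problem, but I do not know why it occurs. Just add the following property to the yarn-site.xml file. According to the Apache Hadoop documentation, this setting determines whether virtual memory limits are enforced for containers.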

<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>

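【Discussion】:

    【Solution 2】:

    What is your Spark version? This is a known bug in Spark that was fixed in version 1.6. For more details, see [SPARK-9844].

    【Discussion】:

    • I am using 1.5.2. I had almost forgotten about this question. Thanks anyway!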