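【Posted】: 2019-11-03 17:50:01
【Question】:
I am running Spark on YARN with:
Ambari 2.7.4
HDP Standalone 3.1.4
Spark 2.3.2
Hadoop 3.1.1
Graphite (latest) on Docker
I am trying to collect Spark metrics with the Graphite sink, following this tutorial.
The Advanced spark2-metrics-properties section in Ambari contains: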
driver.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
executor.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
worker.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
master.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=ap-test-m.c.gcp-ps.internal
*.sink.graphite.port=2003
*.sink.graphite.protocol=tcp
*.sink.graphite.period=10
*.sink.graphite.unit=seconds
*.sink.graphite.prefix=app-test
*.source.jvm.class=org.apache.spark.metrics.source.JvmSource
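One thing worth ruling out with the sink settings above is whether every node that runs an executor can reach the Graphite host on port 2003. The following is a minimal sketch, assuming Graphite's plaintext protocol (one `<path> <value> <timestamp>` line per metric over TCP); the helper names `graphite_line` and `send_test_metric` and the metric name `connectivity.check` are made up for illustration.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one metric as a Graphite plaintext line: path, value, epoch seconds."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"


def send_test_metric(host, port, path, value, timeout=5.0):
    """Open a TCP connection to Graphite and send a single plaintext metric line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(graphite_line(path, value).encode("ascii"))


# Example call (host, port and prefix come from the properties above;
# the metric name is hypothetical):
# send_test_metric("ap-test-m.c.gcp-ps.internal", 2003,
#                  "app-test.connectivity.check", 1)
```

If the test metric shows up in Graphite when sent from the node that hosts the executors, the network path is probably not the cause of the missing executor metrics.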
The spark-submit command:
export HADOOP_CONF_DIR=/usr/hdp/3.1.4.0-315/hadoop/conf/; spark-submit --class com.Main --master yarn --deploy-mode client --driver-memory 1g --executor-memory 10g --num-executors 2 --executor-cores 2 spark-app.jar /data
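As a result, I only get driver metrics.
I also tried passing metrics.properties with the spark-submit command in addition to the global Spark metrics properties, but that did not help.
Finally, I tried setting the configuration both in spark-submit and in the Java SparkConf: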
--conf "spark.metrics.conf.driver.sink.graphite.class"="org.apache.spark.metrics.sink.GraphiteSink"
--conf "spark.metrics.conf.executor.sink.graphite.class"="org.apache.spark.metrics.sink.GraphiteSink"
--conf "worker.sink.graphite.class"="org.apache.spark.metrics.sink.GraphiteSink"
--conf "master.sink.graphite.class"="org.apache.spark.metrics.sink.GraphiteSink"
--conf "spark.metrics.conf.*.sink.graphite.host"="host"
--conf "spark.metrics.conf.*.sink.graphite.port"=2003
--conf "spark.metrics.conf.*.sink.graphite.period"=10
--conf "spark.metrics.conf.*.sink.graphite.unit"=seconds
--conf "spark.metrics.conf.*.sink.graphite.prefix"="app-test"
--conf "spark.metrics.conf.*.source.jvm.class"="org.apache.spark.metrics.source.JvmSource"
But that did not help either.
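For reference, Spark's monitoring documentation describes passing metrics properties as ordinary Spark configuration by prefixing each key with `spark.metrics.conf.`. In the flags above, the `worker` and `master` lines lack that prefix. The sketch below generates the `--conf` flags from a properties text so that every key gets the same prefix; `to_conf_flags` is a hypothetical helper, and the property values are the ones from this question.

```python
import shlex

# Properties as they would appear in metrics.properties (values from this question).
METRICS_PROPERTIES = """\
driver.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
executor.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=host
*.sink.graphite.port=2003
*.sink.graphite.period=10
*.sink.graphite.unit=seconds
*.sink.graphite.prefix=app-test
*.source.jvm.class=org.apache.spark.metrics.source.JvmSource
"""

PREFIX = "spark.metrics.conf."


def to_conf_flags(properties_text):
    """Turn key=value lines into ['--conf', 'spark.metrics.conf.key=value', ...]."""
    flags = []
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        flags.extend(["--conf", f"{PREFIX}{key.strip()}={value.strip()}"])
    return flags


FLAGS = to_conf_flags(METRICS_PROPERTIES)

# shlex.join (Python 3.8+) quotes the '*' so the shell does not expand it.
print(shlex.join(FLAGS))
```

The printed string can be pasted into a spark-submit command line. Whether this alone makes the executor metrics appear is not something the sketch can confirm.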
CSVSink also yields only driver metrics.
UPD
When I submit the job in cluster mode, I get the same metrics as the Spark History Server shows. However, the JVM metrics are still missing.
【Discussion】:
Tags: apache-spark monitoring hadoop2 metrics graphite