【发布时间】:2019-09-03 23:47:39
【问题描述】:
我正在通过 spark-sql 使用 spark-Cassandra 驱动程序来查询我的 Cassandra 集群。每个 Cassandra 节点都有一个 spark worker(位于同一位置)。
问题:在向执行程序提交任务之前有很长的延迟(基于 Web UI 上的时间戳和驱动程序日志)。该查询是一个简单的选择,它指定所有 cassandra 分区键并包含两个阶段和两个任务。以前,查询在另一台具有共置驱动程序和主服务器的服务器上需要 300 毫秒。
但我必须将我的应用程序和 spark master 移动到另一台服务器(与以前相同,但只是在另一台物理服务器上),现在查询需要 40 秒。虽然任务持续时间约为 7 秒,但 Job 花了 40 秒,我不知道额外的延迟是为了什么?
我还用与 Cassandra 无关的工作检查了 spark,它花了 200 毫秒,所以我认为它与 spark-cassandra 的关系比火花本身更重要。
这是作业执行期间的火花日志:
[INFO ] 2019-03-04 06:59:07.067 [qtp1151421920-470] SparkSqlParser 54 - Parsing command: select * from ...
[INFO ] 2019-03-04 06:59:07.276 [qtp1151421920-470] CassandraSourceRelation 35 - Input Predicates: ...
[INFO ] 2019-03-04 06:59:07.279 [qtp1151421920-470] ClockFactory 52 - Using native clock to generate timestamps.
[INFO ] 2019-03-04 06:59:07.439 [qtp1151421920-470] Cluster 1543 - New Cassandra host /192.168.1.201:9042 added
[INFO ] 2019-03-04 06:59:07.440 [qtp1151421920-470] Cluster 1543 - New Cassandra host /192.168.1.202:9042 added
[INFO ] 2019-03-04 06:59:07.440 [qtp1151421920-470] Cluster 1543 - New Cassandra host /192.168.1.203:9042 added
[INFO ] 2019-03-04 06:59:07.440 [qtp1151421920-470] Cluster 1543 - New Cassandra host /192.168.1.204:9042 added
[INFO ] 2019-03-04 06:59:07.446 [qtp1151421920-470] CassandraConnector 35 - Connected to Cassandra cluster: Digger Cluster
[INFO ] 2019-03-04 06:59:07.526 [qtp1151421920-470] CassandraSourceRelation 35 - Input Predicates: ...
[INFO ] 2019-03-04 06:59:07.848 [qtp1151421920-470] CodeGenerator 54 - Code generated in 120.31952 ms
[INFO ] 2019-03-04 06:59:08.264 [qtp1151421920-470] CodeGenerator 54 - Code generated in 15.084165 ms
[INFO ] 2019-03-04 06:59:08.289 [qtp1151421920-470] CodeGenerator 54 - Code generated in 17.893182 ms
[INFO ] 2019-03-04 06:59:08.379 [qtp1151421920-470] SparkContext 54 - Starting job: collectAsList at MyClass.java:5
[INFO ] 2019-03-04 06:59:08.394 [dag-scheduler-event-loop] DAGScheduler 54 - Registering RDD 12 (toJSON at MyClass.java.java:5)
[INFO ] 2019-03-04 06:59:08.397 [dag-scheduler-event-loop] DAGScheduler 54 - Got job 0 (collectAsList at MyClass.java.java:5) with 1 output partitions
[INFO ] 2019-03-04 06:59:08.398 [dag-scheduler-event-loop] DAGScheduler 54 - Final stage: ResultStage 1 (collectAsList at MyClass.java.java:5)
[INFO ] 2019-03-04 06:59:08.398 [dag-scheduler-event-loop] DAGScheduler 54 - Parents of final stage: List(ShuffleMapStage 0)
[INFO ] 2019-03-04 06:59:08.400 [dag-scheduler-event-loop] DAGScheduler 54 - Missing parents: List(ShuffleMapStage 0)
[INFO ] 2019-03-04 06:59:08.405 [dag-scheduler-event-loop] DAGScheduler 54 - Submitting ShuffleMapStage 0 (MapPartitionsRDD[12] at toJSON at MyClass.java.java:5), which has no missing parents
[INFO ] 2019-03-04 06:59:15.703 [pool-44-thread-1] CassandraConnector 35 - Disconnected from Cassandra cluster: Digger Cluster
-----------------long delay here
[INFO ] 2019-03-04 06:59:43.547 [dag-scheduler-event-loop] MemoryStore 54 - Block broadcast_0 stored as values in memory (estimated size 20.6 KB, free 17.8 GB)
[INFO ] 2019-03-04 06:59:43.579 [dag-scheduler-event-loop] MemoryStore 54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 9.5 KB, free 17.8 GB)
[INFO ] 2019-03-04 06:59:43.581 [dispatcher-event-loop-1] BlockManagerInfo 54 - Added broadcast_0_piece0 in memory on 192.168.1.94:38311 (size: 9.5 KB, free: 17.8 GB)
[INFO ] 2019-03-04 06:59:43.584 [dag-scheduler-event-loop] SparkContext 54 - Created broadcast 0 from broadcast at DAGScheduler.scala:1006
[INFO ] 2019-03-04 06:59:43.597 [dag-scheduler-event-loop] DAGScheduler 54 - Submitting 1 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[12] at toJSON at MyClass.java.java:5) (first 15 tasks are for partitions Vector(0))
[INFO ] 2019-03-04 06:59:43.598 [dag-scheduler-event-loop] TaskSchedulerImpl 54 - Adding task set 0.0 with 1 tasks
[INFO ] 2019-03-04 06:59:43.619 [dag-scheduler-event-loop] FairSchedulableBuilder 54 - Added task set TaskSet_0.0 tasks to pool rest
[INFO ] 2019-03-04 06:59:43.652 [dispatcher-event-loop-35] TaskSetManager 54 - Starting task 0.0 in stage 0.0 (TID 0, 192.168.1.210, executor 11, partition 0, NODE_LOCAL, 6357 bytes)
[INFO ] 2019-03-04 06:59:43.920 [dispatcher-event-loop-36] BlockManagerInfo 54 - Added broadcast_0_piece0 in memory on 192.168.1.210:42612 (size: 9.5 KB, free: 912.3 MB)
[INFO ] 2019-03-04 06:59:46.591 [task-result-getter-0] TaskSetManager 54 - Finished task 0.0 in stage 0.0 (TID 0) in 2963 ms on 192.168.1.210 (executor 11) (1/1)
[INFO ] 2019-03-04 06:59:46.594 [task-result-getter-0] TaskSchedulerImpl 54 - Removed TaskSet 0.0, whose tasks have all completed, from pool rest
[INFO ] 2019-03-04 06:59:46.601 [dag-scheduler-event-loop] DAGScheduler 54 - ShuffleMapStage 0 (toJSON at MyClass.java.java:5) finished in 2.981 s
[INFO ] 2019-03-04 06:59:46.602 [dag-scheduler-event-loop] DAGScheduler 54 - looking for newly runnable stages
[INFO ] 2019-03-04 06:59:46.603 [dag-scheduler-event-loop] DAGScheduler 54 - running: Set()
[INFO ] 2019-03-04 06:59:46.603 [dag-scheduler-event-loop] DAGScheduler 54 - waiting: Set(ResultStage 1)
[INFO ] 2019-03-04 06:59:46.604 [dag-scheduler-event-loop] DAGScheduler 54 - failed: Set()
[INFO ] 2019-03-04 06:59:46.608 [dag-scheduler-event-loop] DAGScheduler 54 - Submitting ResultStage 1 (MapPartitionsRDD[18] at collectAsList at MyClass.java.java:5), which has no missing parents
[INFO ] 2019-03-04 06:59:46.615 [dag-scheduler-event-loop] MemoryStore 54 - Block broadcast_1 stored as values in memory (estimated size 20.8 KB, free 17.8 GB)
[INFO ] 2019-03-04 06:59:46.618 [dag-scheduler-event-loop] MemoryStore 54 - Block broadcast_1_piece0 stored as bytes in memory (estimated size 9.8 KB, free 17.8 GB)
[INFO ] 2019-03-04 06:59:46.619 [dispatcher-event-loop-21] BlockManagerInfo 54 - Added broadcast_1_piece0 in memory on 192.168.1.94:38311 (size: 9.8 KB, free: 17.8 GB)
[INFO ] 2019-03-04 06:59:46.620 [dag-scheduler-event-loop] SparkContext 54 - Created broadcast 1 from broadcast at DAGScheduler.scala:1006
[INFO ] 2019-03-04 06:59:46.622 [dag-scheduler-event-loop] DAGScheduler 54 - Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[18] at collectAsList at MyClass.java.java:5) (first 15 tasks are for partitions Vector(0))
[INFO ] 2019-03-04 06:59:46.622 [dag-scheduler-event-loop] TaskSchedulerImpl 54 - Adding task set 1.0 with 1 tasks
[INFO ] 2019-03-04 06:59:46.622 [dag-scheduler-event-loop] FairSchedulableBuilder 54 - Added task set TaskSet_1.0 tasks to pool rest
[INFO ] 2019-03-04 06:59:46.627 [dispatcher-event-loop-25] TaskSetManager 54 - Starting task 0.0 in stage 1.0 (TID 1, 192.168.1.212, executor 9, partition 0, PROCESS_LOCAL, 4730 bytes)
[INFO ] 2019-03-04 06:59:46.851 [dispatcher-event-loop-9] BlockManagerInfo 54 - Added broadcast_1_piece0 in memory on 192.168.1.212:43471 (size: 9.8 KB, free: 912.3 MB)
[INFO ] 2019-03-04 06:59:47.257 [dispatcher-event-loop-38] MapOutputTrackerMasterEndpoint 54 - Asked to send map output locations for shuffle 0 to 192.168.1.212:46794
[INFO ] 2019-03-04 06:59:47.262 [map-output-dispatcher-0] MapOutputTrackerMaster 54 - Size of output statuses for shuffle 0 is 141 bytes
[INFO ] 2019-03-04 06:59:47.763 [task-result-getter-1] TaskSetManager 54 - Finished task 0.0 in stage 1.0 (TID 1) in 1140 ms on 192.168.1.212 (executor 9) (1/1)
[INFO ] 2019-03-04 06:59:47.763 [task-result-getter-1] TaskSchedulerImpl 54 - Removed TaskSet 1.0, whose tasks have all completed, from pool rest
[INFO ] 2019-03-04 06:59:47.765 [dag-scheduler-event-loop] DAGScheduler 54 - ResultStage 1 (collectAsList at MyClass.java.java:5) finished in 1.142 s
[INFO ] 2019-03-04 06:59:47.771 [qtp1151421920-470] DAGScheduler 54 - Job 0 finished: collectAsList at MyClass.java.java:5, took 39.391066 s
[INFO ] 2019-03-04 07:00:09.014 [Spark Context Cleaner] ContextCleaner 54 - Cleaned accumulator 4
[INFO ] 2019-03-04 07:00:09.015 [Spark Context Cleaner] ContextCleaner 54 - Cleaned accumulator 0
[INFO ] 2019-03-04 07:00:09.015 [Spark Context Cleaner] ContextCleaner 54 - Cleaned accumulator 3
[INFO ] 2019-03-04 07:00:09.015 [Spark Context Cleaner] ContextCleaner 54 - Cleaned accumulator 1
[INFO ] 2019-03-04 07:00:09.028 [dispatcher-event-loop-10] BlockManagerInfo 54 - Removed broadcast_1_piece0 on 192.168.1.94:38311 in memory (size: 9.8 KB, free: 17.8 GB)
[INFO ] 2019-03-04 07:00:09.045 [dispatcher-event-loop-0] BlockManagerInfo 54 - Removed broadcast_1_piece0 on 192.168.1.212:43471 in memory (size: 9.8 KB, free: 912.3 MB)
[INFO ] 2019-03-04 07:00:09.063 [Spark Context Cleaner] ContextCleaner 54 - Cleaned shuffle 0
[INFO ] 2019-03-04 07:00:09.065 [dispatcher-event-loop-16] BlockManagerInfo 54 - Removed broadcast_0_piece0 on 192.168.1.94:38311 in memory (size: 9.5 KB, free: 17.8 GB)
[INFO ] 2019-03-04 07:00:09.071 [dispatcher-event-loop-37] BlockManagerInfo 54 - Removed broadcast_0_piece0 on 192.168.1.210:42612 in memory (size: 9.5 KB, free: 912.3 MB)
[INFO ] 2019-03-04 07:00:09.074 [Spark Context Cleaner] ContextCleaner 54 - Cleaned accumulator 2
还附上了屏幕截图,以激发作业及其任务的 web ui。日志和图像不适用于同一个作业。
P.S:每次我运行查询时,spark-cassandra 连接器是否都会创建一个新会话(我每次都看到连接断开到 cassandra 集群)?我并行运行许多查询,这不会比纯 cassandra 慢得多吗?
【问题讨论】:
-
connector、spark 和 cassandra 有哪些版本?
-
cassandra:3.11.2,火花:2.2.1,connector_2.11 2.0.7
标签: apache-spark cassandra spark-cassandra-connector