【发布时间】:2014-07-14 18:41:05
【问题描述】:
我正在使用实时日志数据尝试 graphx,https://snap.stanford.edu/data/soc-LiveJournal1.html。
我有一个由 10 个计算节点组成的集群。每个计算节点有 64G RAM 和 32 个内核。
当我使用 9 个工作节点运行 pagerank 算法时,它比仅使用 1 个工作节点运行它要慢。我怀疑由于某些配置问题,我没有使用所有内存和/或内核。
我浏览了 spark 的配置、调优和编程指南。
我正在使用 spark-shell 运行由
调用的脚本./spark-shell --executor-memory 50g
我让工人和主人运行。当我启动 spark-shell 时,我得到以下日志
14/07/09 17:26:10 INFO Slf4jLogger: Slf4jLogger started
14/07/09 17:26:10 INFO Remoting: Starting remoting
14/07/09 17:26:10 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark@node0472.local:60035]
14/07/09 17:26:10 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark@node0472.local:60035]
14/07/09 17:26:10 INFO SparkEnv: Registering MapOutputTracker
14/07/09 17:26:10 INFO SparkEnv: Registering BlockManagerMaster
14/07/09 17:26:10 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20140709172610-7f5e
14/07/09 17:26:10 INFO MemoryStore: MemoryStore started with capacity 294.4 MB.
14/07/09 17:26:10 INFO ConnectionManager: Bound socket to port 45700 with id = ConnectionManagerId(node0472.local,45700)
14/07/09 17:26:10 INFO BlockManagerMaster: Trying to register BlockManager
14/07/09 17:26:10 INFO BlockManagerInfo: Registering block manager node0472.local:45700 with 294.4 MB RAM
14/07/09 17:26:10 INFO BlockManagerMaster: Registered BlockManager
14/07/09 17:26:10 INFO HttpServer: Starting HTTP Server
14/07/09 17:26:10 INFO HttpBroadcast: Broadcast server started at http://172.16.104.72:48116
14/07/09 17:26:10 INFO HttpFileServer: HTTP File server directory is /tmp/spark-7b4a7c3c-9fc9-4a64-b2ac-5f328abe9265
14/07/09 17:26:10 INFO HttpServer: Starting HTTP Server
14/07/09 17:26:11 INFO SparkUI: Started SparkUI at http://node0472.local:4040
14/07/09 17:26:12 INFO AppClient$ClientActor: Connecting to master spark://node0472.local:7077...
14/07/09 17:26:12 INFO SparkILoop: Created spark context..
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20140709172612-0007
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/0 on worker-20140709162149-node0476.local-53728 (node0476.local:53728) with 32 cores
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/0 on hostPort node0476.local:53728 with 32 cores, 50.0 GB RAM
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/1 on worker-20140709162145-node0475.local-56009 (node0475.local:56009) with 32 cores
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/1 on hostPort node0475.local:56009 with 32 cores, 50.0 GB RAM
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/2 on worker-20140709162141-node0474.local-58108 (node0474.local:58108) with 32 cores
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/2 on hostPort node0474.local:58108 with 32 cores, 50.0 GB RAM
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/3 on worker-20140709170011-node0480.local-49021 (node0480.local:49021) with 32 cores
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/3 on hostPort node0480.local:49021 with 32 cores, 50.0 GB RAM
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/4 on worker-20140709165929-node0479.local-53886 (node0479.local:53886) with 32 cores
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/4 on hostPort node0479.local:53886 with 32 cores, 50.0 GB RAM
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/5 on worker-20140709170036-node0481.local-60958 (node0481.local:60958) with 32 cores
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/5 on hostPort node0481.local:60958 with 32 cores, 50.0 GB RAM
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/6 on worker-20140709162151-node0477.local-44550 (node0477.local:44550) with 32 cores
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/6 on hostPort node0477.local:44550 with 32 cores, 50.0 GB RAM
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/7 on worker-20140709162138-node0473.local-42025 (node0473.local:42025) with 32 cores
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/7 on hostPort node0473.local:42025 with 32 cores, 50.0 GB RAM
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/8 on worker-20140709162156-node0478.local-52943 (node0478.local:52943) with 32 cores
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/8 on hostPort node0478.local:52943 with 32 cores, 50.0 GB RAM
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/1 is now RUNNING
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/0 is now RUNNING
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/2 is now RUNNING
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/3 is now RUNNING
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/6 is now RUNNING
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/4 is now RUNNING
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/5 is now RUNNING
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/8 is now RUNNING
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/7 is now RUNNING
Spark context available as sc.
scala> 14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@node0479.local:47343/user/Executor#1253632521] with ID 4
14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@node0474.local:39431/user/Executor#1607018658] with ID 2
14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@node0481.local:53722/user/Executor#-1846270627] with ID 5
14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@node0477.local:40185/user/Executor#-111495591] with ID 6
14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@node0473.local:36426/user/Executor#652192289] with ID 7
14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@node0480.local:37230/user/Executor#-1581927012] with ID 3
14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@node0475.local:46363/user/Executor#-182973444] with ID 1
14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@node0476.local:58053/user/Executor#609775393] with ID 0
14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@node0478.local:55152/user/Executor#-2126598605] with ID 8
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0474.local:60025 with 28.8 GB RAM
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0473.local:33992 with 28.8 GB RAM
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0481.local:46513 with 28.8 GB RAM
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0477.local:37455 with 28.8 GB RAM
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0475.local:33829 with 28.8 GB RAM
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0479.local:56433 with 28.8 GB RAM
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0480.local:38134 with 28.8 GB RAM
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0476.local:46284 with 28.8 GB RAM
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0478.local:43187 with 28.8 GB RAM
根据日志,我相信我的应用程序是在 worker 上注册的,并且每个 executor 有 50g 的 RAM。现在,我在终端上运行以下 scala 代码来加载数据并计算 pagerank
import org.apache.spark._
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
val startgraphloading = System.currentTimeMillis;
val graph = GraphLoader.edgeListFile(sc, "filepath").cache();
graph.cache();
val endgraphloading = System.currentTimeMillis;
val startpr1 = System.currentTimeMillis;
val prGraph = graph.staticPageRank(1)
val endpr1 = System.currentTimeMillis;
val startpr2 = System.currentTimeMillis;
val prGraph = graph.staticPageRank(5)
val endpr2 = System.currentTimeMillis;
val loadingt = endgraphloading - startgraphloading;
val firstt = endpr1 - startpr1
val secondt = endpr2 - startpr2
print(loadingt)
print(firstt)
print(secondt)
当我尝试查看每个节点上的内存使用情况时,实际上只使用了 2-3 个计算节点 RAM。这是对的吗?仅 1 名工人比 9 名工人运行得更快。
我使用的是 spark 独立集群模式。配置有问题吗?
提前致谢:)
【问题讨论】:
标签: apache-spark