【Posted】: 2019-10-06 03:29:22
【Question】:
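The official explanation is that maxIterations is meant for non-converging algorithms. My question is: if I don't know whether my algorithm converges, how should I set the value of maxIterations? And if the algorithm does converge, what does this value mean?
By the way, I am also confused about what an "iteration" is in Pregel. How much of the code's execution counts as one iteration?
Here is part of the Pregel source code: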
// Loop
var prevG: Graph[VD, ED] = null
var i = 0
while (activeMessages > 0 && i < maxIterations) {
  // Receive the messages and update the vertices.
  prevG = g
  g = g.joinVertices(messages)(vprog)
  graphCheckpointer.update(g)

  val oldMessages = messages
  // Send new messages, skipping edges where neither side received a message. We must cache
  // messages so it can be materialized on the next line, allowing us to uncache the previous
  // iteration.
  messages = GraphXUtils.mapReduceTriplets(
    g, sendMsg, mergeMsg, Some((oldMessages, activeDirection)))
  // The call to count() materializes `messages` and the vertices of `g`. This hides oldMessages
  // (depended on by the vertices of g) and the vertices of prevG (depended on by oldMessages
  // and the vertices of g).
  messageCheckpointer.update(messages.asInstanceOf[RDD[(VertexId, A)]])
  activeMessages = messages.count()

  logInfo("Pregel finished iteration " + i)

  // Unpersist the RDDs hidden by newly-materialized RDDs
  oldMessages.unpersist(blocking = false)
  prevG.unpersistVertices(blocking = false)
  prevG.edges.unpersist(blocking = false)
  // count the iteration
  i += 1
}
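To make the two stopping conditions in this loop concrete, here is a minimal plain-Scala sketch that mimics its shape on ordinary collections, with no Spark involved. It is not the GraphX implementation: `PregelLoopSketch`, `Edge`, `shortestPaths`, and `sendAll` are hypothetical names chosen for illustration, and single-source shortest paths is just an assumed example of a converging algorithm. One pass through the `while` body (apply `vprog` to the vertices that received messages, then compute and merge the next round of messages) is what the counter `i` counts.

```scala
// Hypothetical, Spark-free sketch of the superstep loop; not the GraphX source.
object PregelLoopSketch {
  type VertexId = Long
  final case class Edge(src: VertexId, dst: VertexId, weight: Double)

  /** Returns the final distances and the number of loop iterations executed. */
  def shortestPaths(
      vertexIds: Set[VertexId],
      edges: Seq[Edge],
      source: VertexId,
      maxIterations: Int = Int.MaxValue
  ): (Map[VertexId, Double], Int) = {
    // Vertex program: keep the smaller of the stored distance and the message.
    def vprog(id: VertexId, dist: Double, msg: Double): Double = math.min(dist, msg)
    // Merge function: combine messages addressed to the same vertex.
    def mergeMsg(a: Double, b: Double): Double = math.min(a, b)

    // One message round. When `active` is defined, edges where neither endpoint
    // received a message in the previous round are skipped.
    def sendAll(g: Map[VertexId, Double], active: Option[Set[VertexId]]): Map[VertexId, Double] =
      edges
        .filter(e => active.forall(a => a(e.src) || a(e.dst)))
        .collect { case e if g(e.src) + e.weight < g(e.dst) => e.dst -> (g(e.src) + e.weight) }
        .groupBy(_._1)
        .map { case (dst, msgs) => dst -> msgs.map(_._2).reduce(mergeMsg) }

    // Before the loop: every vertex sees the initial message, then messages are sent once.
    val init = vertexIds.map(id => id -> (if (id == source) 0.0 else Double.PositiveInfinity)).toMap
    var g = init.map { case (id, d) => id -> vprog(id, d, Double.PositiveInfinity) }
    var messages = sendAll(g, None)
    var i = 0
    // Same two stopping conditions as the loop quoted above.
    while (messages.nonEmpty && i < maxIterations) {
      // Only vertices that received a message run vprog.
      g = g.map { case (id, d) => id -> messages.get(id).fold(d)(m => vprog(id, d, m)) }
      messages = sendAll(g, Some(messages.keySet))
      i += 1 // one superstep counted
    }
    (g, i)
  }
}
```

On the chain 1 -> 2 -> 3 -> 4 with unit weights and source 1, this sketch stops by itself after 3 iterations because no messages remain, so the default cap is never reached. With `maxIterations = 2` it stops after 2 iterations while a message for vertex 4 is still pending, and vertex 4 keeps an infinite distance. In this sketch the cap only matters when the algorithm has not yet run out of messages.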
Thank you for your generous answers :)
【Comments】:
Tags: apache-spark iteration spark-graphx