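Since both RDDs are of type RDD[(String, Int)], you can simply combine them with join, which gives you an RDD[(String, (Int, Int))]. If you then want a List[(String, (Int, Int))], you need to collect the joined RDD (not recommended if the joined RDD is large) and convert it to a List. Try the following code: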
import org.apache.spark.rdd.RDD
val rdd1: RDD[(String, Int)] = sc.parallelize(List(("aaa", 1), ("bbb", 4), ("ccc", 3)))
val rdd2: RDD[(String, Int)] = sc.parallelize(List(("aaa", 2), ("bbb", 5), ("ddd", 2)))
//simply join two RDDs (this is an inner join: only keys present in both RDDs are kept)
val joinedRdd: RDD[(String, (Int, Int))] = rdd1.join(rdd2)
//only if you want List then collect it (It is not recommended for huge RDDs)
val lst: List[(String, (Int, Int))] = joinedRdd.collect().toList
println(lst)
//output (the order of elements is not guaranteed and may differ between runs)
//List((bbb,(4,5)), (aaa,(1,2)))
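If the joined RDD might be large, one option is to pull only a bounded number of elements to the driver with take instead of collect. The following is a minimal sketch, not the only way to do it. It assumes a local master and a made-up app name ("join-sketch") and limit (10) for illustration; in spark-shell the existing context is reused by getOrCreate.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Assumption: local mode and the app name are for demonstration only.
val conf = new SparkConf().setAppName("join-sketch").setMaster("local[*]")
val sc = SparkContext.getOrCreate(conf)

val rdd1: RDD[(String, Int)] = sc.parallelize(List(("aaa", 1), ("bbb", 4), ("ccc", 3)))
val rdd2: RDD[(String, Int)] = sc.parallelize(List(("aaa", 2), ("bbb", 5), ("ddd", 2)))

// Inner join on the key: "ccc" and "ddd" are dropped because each appears in only one RDD.
val joinedRdd: RDD[(String, (Int, Int))] = rdd1.join(rdd2)

// take(n) brings at most n elements to the driver, unlike collect(), which brings all of them.
// The limit of 10 is an arbitrary value chosen for this sketch.
val preview: List[(String, (Int, Int))] = joinedRdd.take(10).toList

// Sort on the driver so the printed order is stable.
val sorted: List[(String, (Int, Int))] = preview.sortBy(_._1)
println(sorted)
// prints List((aaa,(1,2)), (bbb,(4,5)))
```

Sorting the small local list by key is only there to make the printed result reproducible, since the order in which a join returns its elements is not fixed.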