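【Posted】: 2020-01-20 06:08:31
【Question】:
Note: the suggestions given in the answers to the following questions did not work: "value toDF is not a member of org.apache.spark.rdd.RDD" and "value toDF is not a member of org.apache.spark.rdd.RDD[Weather]".
I am trying to write a generic function that keeps only the top k values for each key in a given Dataset.
Here is the code: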
def topKReduceByKey[K:ClassTag,V:Ordering](ds: Dataset[(K, V)], k: Int): Dataset[(K, V)] = {
  import sqlContext.implicits._
  ds
    .rdd
    .map(tuple => (tuple._1, Seq(tuple._2)))
    .reduceByKey((x, y) => (x ++ y).sorted(Ordering[V].reverse).take(k))
    .flatMap(tuple => tuple._2.map(v => (tuple._1, v)))
    .toDF("key", "value")
    .as[(K, V)]
}
When I try to run it, I get the following error message:
Error:(43, 8) value toDF is not a member of org.apache.spark.rdd.RDD[(K, V)]
possible cause: maybe a semicolon is missing before `value toDF'?
.toDF("key", "value")
Can anyone help me understand what is going wrong here?
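For reference, the behavior I am after (keep the k largest values for each key) can be sketched without Spark, using plain Scala collections. This is only an illustration of the intended semantics, not the Spark implementation; `TopKSketch` and `topKPerKey` are hypothetical names introduced for this example.

```scala
// Spark-free sketch of the intended semantics: keep the k largest values per key.
// `TopKSketch` and `topKPerKey` are hypothetical names, not part of Spark.
object TopKSketch {
  def topKPerKey[K, V: Ordering](pairs: Seq[(K, V)], k: Int): Map[K, Seq[V]] =
    pairs
      .groupBy(_._1) // collect all pairs that share a key
      .map { case (key, kvs) =>
        // sort the values in descending order and keep the first k
        key -> kvs.map(_._2).sorted(Ordering[V].reverse).take(k)
      }

  def main(args: Array[String]): Unit = {
    val data = Seq(("a", 3), ("a", 1), ("a", 7), ("b", 5), ("b", 9))
    println(topKPerKey(data, 2))
  }
}
```

The Spark version above is meant to produce the same pairs, flattened back into (key, value) rows instead of being grouped in a Map.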
【Comments】:
Tags: scala apache-spark