【Question title】: Spark Scala: convert columns into List
【Posted】: 2020-02-09 15:51:14
【Question description】:

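I have the following data structure, where each row holds a column name (in the last column, columnname) and its possible values, something like this: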

|col1       |col2            |col3       |columnname   |
+-----------+----------------+-----------+-------------+
|Very High  |High            |Medium     |predchurnrisk|
|Active     |Lapsed          |Renew      |userstatus   |
|Very High  |High            |Medium     |predinmarket |
|High flyers|Watching Pennies|Big pockets|predsegmentid|
|Male       |Female          |Others     |usergender   |
+-----------+----------------+-----------+-------------+

I want a variable domainvalues of type Array[(String, List[String])]:

[predchurnrisk,(Very High, High, Medium)]
[userstatus,(Active, Lapsed, Renew)]
.

How can I do this with map or foreach?

【Question comments】:

Tags: scala apache-spark


【Solution 1】:

As a start:

import org.apache.spark.sql.functions._
import spark.implicits._  // required for toDF, so it has to come before the next line
val df = sc.parallelize(Seq(("Very High","High","Medium","predchurnrisk"),("Active","Lapsed","Renew","userstatus"))).toDF("col1","col2","col3","columnname")
// Pack the three value columns into one array column "arr", then drop the originals
df.withColumn("arr", array("col1", "col2", "col3")).drop("col1","col2","col3").show

This prints a table with the columnname column and the new array column arr.

Maybe you can take it from here, cheers!
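To take it from there to the Array[(String, List[String])] the question asks for, here is a minimal sketch rather than a definitive implementation. It assumes the DataFrame has exactly the columns col1, col2, col3 and columnname, and that it is small enough to collect to the driver. The object name DomainValues and the method names viaArrayColumn and viaRowMap are made up for illustration; the local SparkSession in main is also an assumption.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{array, col}

// Hypothetical helper object; names are illustrative only.
object DomainValues {

  // Variant 1: continue from the array column built in the answer,
  // read the rows as typed tuples, collect, and convert Seq to List.
  def viaArrayColumn(df: DataFrame): Array[(String, List[String])] = {
    import df.sparkSession.implicits._ // encoders for .as[...]
    df.select(col("columnname"), array("col1", "col2", "col3").as("arr"))
      .as[(String, Seq[String])]
      .collect()
      .map { case (name, values) => (name, values.toList) }
  }

  // Variant 2: collect the rows first, then use a plain map on the driver.
  def viaRowMap(df: DataFrame): Array[(String, List[String])] =
    df.select("columnname", "col1", "col2", "col3")
      .collect()
      .map(r => (r.getString(0), List(r.getString(1), r.getString(2), r.getString(3))))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, just for trying the sketch out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("domainvalues")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(
      ("Very High", "High", "Medium", "predchurnrisk"),
      ("Active", "Lapsed", "Renew", "userstatus")
    ).toDF("col1", "col2", "col3", "columnname")

    val domainvalues: Array[(String, List[String])] = viaArrayColumn(df)

    // Print each entry in the shape shown in the question.
    domainvalues.foreach { case (name, values) =>
      println(s"[$name,(${values.mkString(", ")})]")
    }

    spark.stop()
  }
}
```

Both variants pull every row to the driver with collect(), which is fine for a small lookup table like this one but would not suit a large DataFrame.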

【Discussion】:
