【Posted】: 2021-01-18 20:40:28
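【Problem description】:
I have a Spark DataFrame with the values shown below, and I am struggling to find a way to convert it into separate columns such as id, Fld1, Fld2, and so on. Any help, or pointers to documentation on how to do this, would be appreciated.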
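// Needed for toDF outside spark-shell (spark-shell imports it automatically)
import spark.implicits._

val df2 = Seq(
  ("1", Map("Fld1" -> "USA", "Fld2" -> "UK")),
  ("2", Map("Fld1" -> "Germany", "Fld2" -> "Portugal"))
).toDF("id", "map")
df2.show(false) // false disables truncation of the map column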
Input:
+---+-----------------------------------+
|id |map                                |
+---+-----------------------------------+
|1  |[Fld1 -> USA, Fld2 -> UK]          |
|2  |[Fld1 -> Germany, Fld2 -> Portugal]|
+---+-----------------------------------+
Expected output:
+---+-------+--------+
| id|   Fld1|    Fld2|
+---+-------+--------+
|  1|    USA|      UK|
|  2|Germany|Portugal|
+---+-------+--------+
【Discussion】:
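One possible approach, offered as a sketch rather than a definitive answer: because the map keys (Fld1, Fld2) are data and not part of the schema, first collect the distinct keys with `map_keys` and `explode`, then select one column per key with `getItem`. The object name `MapToColumns`, the helper `flatten`, and the local `SparkSession` setup are hypothetical and exist only for illustration. `map_keys` assumes Spark 2.3 or later.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode, map_keys}

object MapToColumns {
  // Hypothetical helper: turns every key of the map column `mapCol`
  // into its own top-level column and drops the map column itself.
  def flatten(df: DataFrame, mapCol: String): DataFrame = {
    // Gather the distinct keys on the driver (this runs an extra Spark job).
    val keys = df
      .select(explode(map_keys(col(mapCol))))
      .distinct()
      .collect()
      .map(_.getString(0))
      .sorted

    // One column per key; rows that lack a key get null in that column.
    val keyCols = keys.map(k => col(mapCol).getItem(k).as(k))
    val otherCols = df.columns.filter(_ != mapCol).map(col)
    df.select(otherCols ++ keyCols: _*)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("map-to-columns")
      .getOrCreate()
    import spark.implicits._

    val df2 = Seq(
      ("1", Map("Fld1" -> "USA", "Fld2" -> "UK")),
      ("2", Map("Fld1" -> "Germany", "Fld2" -> "Portugal"))
    ).toDF("id", "map")

    flatten(df2, "map").show()
    spark.stop()
  }
}
```

If the keys are fixed and known in advance, the key collection can be skipped and the columns selected directly, for example `df2.select(col("id"), col("map").getItem("Fld1").as("Fld1"), col("map").getItem("Fld2").as("Fld2"))`, which avoids the extra job.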
Tags: scala apache-spark apache-spark-sql