【发布时间】:2018-08-06 19:41:47
【问题描述】:
我有一个低于 Cols 的数据集。
df.show();
+--------+---------+---------+---------+---------+
| Col1 | Col2 | Expend1 | Expend2 | Expend3 |
+--------+---------+---------+---------+---------+
| Value1 | Cvalue1 | 123 | 2254 | 22 |
| Value1 | Cvalue2 | 124 | 2255 | 23 |
+--------+---------+---------+---------+---------+
我希望使用一些连接或多维数据集或任何操作将其更改为以下格式。
1.
+--------+---------+------+
| Value1 | Cvalue1 | 123 |
| Value1 | Cvalue1 | 2254 |
| Value1 | Cvalue1 | 22 |
| Value1 | Cvalue1 | 124 |
| Value1 | Cvalue1 | 2255 |
| Value1 | Cvalue1 | 23 |
+--------+---------+------+
如果这种格式更好
2.
+--------+---------+---------+------+
| Value1 | Cvalue1 | Expend1 | 123 |
| Value1 | Cvalue1 | Expend2 | 2254 |
| Value1 | Cvalue1 | Expend3 | 22 |
| Value1 | Cvalue1 | Expend1 | 124 |
| Value1 | Cvalue1 | Expend2 | 2255 |
| Value1 | Cvalue1 | Expend3 | 23 |
+--------+---------+---------+------+
我能不能实现以上两种可能的格式。如果在 #1 的情况下,我可以得到 Last value 的列名,无论是 Expend1 还是 Expend 2 或 Expend3。
【问题讨论】:
标签: apache-spark dataframe apache-spark-sql apache-spark-dataset