【Posted】: 2021-08-16 09:45:09
【Question】:
I have a dataframe that looks like this:
df.select("col1").show(1,False)
col1
-------------
[[2,1,0,1,,free],[3,1,0,1,4,free]]
Another way to display it:
df.select(to_json(struct("col1"))).show(1,False)
col1
-----------------
{"col1":[{ "0":"2","1":"1","2":"0","3":"1","5":"free"},{"0":"3","1":"1","2":"0","3":"1","4":"4","5":"free"}]}
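Now I want to produce the dataframe below: field "0" of each array element has to be moved into a new struct ("newattrib") built from the existing column, and a new static field "value" with the value "ZZZ" has to be added to every element.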
col1
--------------
{"col1":[{"1":"1","2":"0","3":"1","5":"free","value":"ZZZ","newattrib":{"0":"2"}},{"1":"1","2":"0","3":"1","4":"4","5":"free","value":"ZZZ","newattrib":{"0":"3"}}]}
Please suggest a way to achieve this.
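To spell out the logic behind the expected output, here is a minimal sketch in plain Python (not Spark) that works on the JSON form shown above. It assumes the rule is: for every element of `col1`, move field `"0"` into a nested `newattrib` object, keep all other fields unchanged, and add the constant field `"value": "ZZZ"`. The helper name `reshape` and its `static_value` parameter are hypothetical, introduced only for illustration.

```python
import json

# JSON copied from the to_json output shown in the question.
row = json.loads(
    '{"col1":[{"0":"2","1":"1","2":"0","3":"1","5":"free"},'
    '{"0":"3","1":"1","2":"0","3":"1","4":"4","5":"free"}]}'
)


def reshape(element, static_value="ZZZ"):
    """Move field "0" under "newattrib" and add the static "value" field.

    Hypothetical helper: it only illustrates the intended per-element rule.
    """
    # Keep every field except "0", in its original order.
    out = {key: val for key, val in element.items() if key != "0"}
    # Add the constant field to every element.
    out["value"] = static_value
    # Nest the former field "0" inside a new object.
    out["newattrib"] = {"0": element["0"]}
    return out


result = {"col1": [reshape(element) for element in row["col1"]]}
print(json.dumps(result, separators=(",", ":")))
```

In Spark, the same per-element reshaping would have to be applied to each struct inside the `col1` array column; the sketch above only pins down what the result should contain.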
【Comments】:
-
Are we supposed to work out what you are doing just from the input and output? Please explain the logic...
Tags: apache-spark pyspark