【Posted】:2020-04-01 07:31:21
【Question】:
With the DataFrame below I get a JSON array, but its data type is string. I am looking for help converting this string into a JSON array.
import org.apache.spark.sql.functions.lit
// Build a one-row DataFrame with the same columns as the input table shown below.
val rawDF = spark.sql("select 'Item_12345' as item_id, 'S_12345' as s_tag").withColumn("jsonString", lit("""[{"First":{"Info":"ABCD123","Res":"5.2"}},{"Second":{"Info":"ABCD123","Res":"5.2"}},{"Third":{"Info":"ABCD123","Res":"5.2"}}]"""))
rawDF.show(false)
Input and output DataFrames:
Input DataFrame :
+----------+-------+-----------------------------------------------------------------------------------------------------------------------------------+
|item_id |s_tag |jsonString |
+----------+-------+-----------------------------------------------------------------------------------------------------------------------------------+
|Item_12345|S_12345|[{"First":{"Info":"ABCD123","Res":"5.2"}},{"Second":{"Info":"ABCD123","Res":"5.2"}},{"Third":{"Info":"ABCD123","Res":"5.2"}}] |
+----------+-------+-----------------------------------------------------------------------------------------------------------------------------------+
Output DataFrame :
+----------+-------+-----------------------------------------+
|item_id   |s_tag  |jsonString                               |
+----------+-------+-----------------------------------------+
|Item_12345|S_12345|{"First":{"Info":"ABCD123","Res":"5.2"}} |
|Item_12345|S_12345|{"Second":{"Info":"ABCD123","Res":"5.2"}}|
|Item_12345|S_12345|{"Third":{"Info":"ABCD123","Res":"5.2"}} |
+----------+-------+-----------------------------------------+
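Problem statement:
jsonString holds string data that looks like a JSON array. I want to convert (cast) this column to a JSON array so that it can be split into one row per array element, as in the output DataFrame.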
What I have tried so far:
import org.json.JSONArray // assumption: JSONArray comes from the org.json library
import org.apache.spark.sql.functions.{explode, udf}
val jsonArray = udf((value: String) => new JSONArray(value)) // Or: how can I convert this to an array of JSON values?
val strToJsonArray = rawDF.withColumn("arrJson", jsonArray(rawDF("jsonString"))).drop("jsonString") // This is not working.
// If the column can be converted to an array, the code below splits it into the expected output.
val splittedDF = strToJsonArray.withColumn("splittedJson", explode(strToJsonArray.col("arrJson"))).drop("arrJson")
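The UDF attempt above most likely fails because Spark cannot derive a schema for a JSONArray return value; a UDF has to return a type Spark SQL understands, such as Seq[String]. Below is a minimal sketch of that idea, assuming the json4s library that Spark itself depends on is available on the classpath. The names splitJsonArray, splitJsonArrayUdf and inputDF are hypothetical, and the local SparkSession is only there to keep the snippet self-contained (written for spark-shell or a Scala script).

```scala
import scala.util.Try

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode, udf}
import org.json4s._
import org.json4s.jackson.JsonMethods.{compact, parse, render}

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("json-array-udf").getOrCreate()
import spark.implicits._

// Hypothetical sample row mirroring the input table in the question.
val inputDF = Seq(
  ("Item_12345", "S_12345",
    """[{"First":{"Info":"ABCD123","Res":"5.2"}},{"Second":{"Info":"ABCD123","Res":"5.2"}},{"Third":{"Info":"ABCD123","Res":"5.2"}}]""")
).toDF("item_id", "s_tag", "jsonString")

// Parse the string and return every top-level array element as compact JSON text.
// Null input, malformed JSON, and non-array JSON all yield an empty Seq.
val splitJsonArray: String => Seq[String] = value =>
  Try(parse(value)).toOption match {
    case Some(JArray(items)) => items.map(item => compact(render(item)))
    case _                   => Seq.empty[String]
  }

// Seq[String] maps to array<string>, so Spark can infer the UDF's return schema.
val splitJsonArrayUdf = udf(splitJsonArray)

// explode turns each array element into its own row.
val splitDF = inputDF.withColumn("jsonString", explode(splitJsonArrayUdf(col("jsonString"))))
splitDF.show(false)
```

Note that swallowing parse errors with Try is a design choice of this sketch: rows with invalid JSON produce an empty array and therefore disappear after explode. If such rows need to be kept, explode_outer may be a better fit.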
How can I convert my string into an array of JSON values?
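One possible way to do this without a UDF, sketched here under the assumption of Spark 2.4 or later (where from_json is understood to accept array schemas whose elements are not structs): parse the column with from_json using ArrayType(StringType), then explode the result. With a string element type, each nested object should come back as its JSON text rather than as a struct. The names inputDF and splitDF are hypothetical, and the local SparkSession is only there to keep the snippet self-contained.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode, from_json}
import org.apache.spark.sql.types.{ArrayType, StringType}

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("json-array-split").getOrCreate()
import spark.implicits._

// Hypothetical sample row mirroring the input table in the question.
val inputDF = Seq(
  ("Item_12345", "S_12345",
    """[{"First":{"Info":"ABCD123","Res":"5.2"}},{"Second":{"Info":"ABCD123","Res":"5.2"}},{"Third":{"Info":"ABCD123","Res":"5.2"}}]""")
).toDF("item_id", "s_tag", "jsonString")

// from_json turns the string into array<string>; explode emits one row per element.
val splitDF = inputDF.withColumn(
  "jsonString",
  explode(from_json(col("jsonString"), ArrayType(StringType)))
)

splitDF.show(false)
```

If the individual fields (Info, Res) are needed as columns later, a more specific schema such as ArrayType(MapType(StringType, ...)) could be passed instead; that variant is not shown here.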
【Comments】:
Tags: arrays json scala apache-spark explode