【Posted】: 2021-01-20 07:46:23
【Question】:
Suppose we have a table header:
id | col1 | col2
1 | "a" | "b"
2 | "c" | "d"
And a table body:
header_id | body_id | body_col
1 | 6 | "abc"
1 | 7 | "def"
2 | 8 | "ghi"
2 | 9 | "jkl"
I want to insert body into header as an array of structs. Expressed as JSON, the result would look like this:
{
id: 1,
col1: "a",
col2: "b",
body: [{body_id: 6, body_col: "abc"}, {body_id: 7, body_col: "def"}]
},
{
id: 2,
col1: "c",
col2: "d",
body: [{body_id: 8, body_col: "ghi"}, {body_id: 9, body_col: "jkl"}]
}
How can I achieve this? AFAIK collect_set or collect_list won't work, because they would just collect the entire column into a single array.
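For illustration, here is a minimal sketch of one possible approach, assuming Spark SQL and that both tables are registered under the names header and body as shown above. It relies on the fact that collect_list is an aggregate function: combined with GROUP BY it collects one array per group rather than one array for the whole column. The aliases h and b are only illustrative.

```sql
-- Sketch only: assumes Spark SQL and tables named header and body.
-- named_struct builds one struct per body row; collect_list then
-- gathers those structs into an array for each header row.
SELECT
  h.id,
  h.col1,
  h.col2,
  collect_list(
    named_struct('body_id', b.body_id, 'body_col', b.body_col)
  ) AS body
FROM header h
JOIN body b
  ON b.header_id = h.id
GROUP BY h.id, h.col1, h.col2;
```

Note that the inner join drops header rows that have no matching body rows, and that the order of elements inside each array is not guaranteed.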
【Discussion】:
Tags: sql apache-spark hive