【Question title】: Create a table in AWS Athena parsing dynamic keys in nested JSON
【Posted】: 2020-06-25 14:10:14
【Question】:
I have JSON files in which every line has the format below. I want to parse this data and load it into a table using AWS Athena.
{
  "123": {
    "abc": {
      "id": "test",
      "data": "ipsum lorum"
    },
    "abd": {
      "id": "test_new",
      "data": "lorum ipsum"
    }
  }
}
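Is it possible to create a table for data in this format? The documentation mentions that struct can be used to parse nested JSON, but it has no examples with dynamic keys.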
Tags:
amazon-web-services
presto
amazon-athena
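【Solution 1】:
You can cast the JSON to a map or an array and transform it any way you want. Because the keys are dynamic, a map (whose keys do not have to be declared in advance) fits better than a struct here. In this case you can use map_values and CROSS JOIN UNNEST to generate rows from the JSON object: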
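-- Sample input: one JSON document held as a string
WITH test AS (
    SELECT '{ "123": { "abc": { "id": "test", "data": "ipsum lorum" }, "abd": { "id": "test_new", "data": "lorum ipsum" } } }' AS str
),
-- Parse the string and cast it to three levels of nested maps
struct_like AS (
    SELECT CAST(json_parse(str) AS map<varchar,
                                       map<varchar,
                                           map<varchar, varchar>>>) AS m
    FROM test
),
-- Drop the top-level keys ("123") and emit one row per inner map
flat AS (
    SELECT item
    FROM struct_like
    CROSS JOIN UNNEST(map_values(m)) AS t(item)
)
-- Unnest each inner map into (key, value) rows and pick out the fields
SELECT
    key,
    value['id'] AS id,
    value['data'] AS data
FROM flat
CROSS JOIN UNNEST(item) AS t(key, value)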
Result:
key   id         data
abc   test       ipsum lorum
abd   test_new   lorum ipsum
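
The query above discards the dynamic top-level key ("123") because it goes through map_values. If that key is needed as a column, one possible variation is to unnest the outer map directly, since UNNEST on a map yields a key column and a value column. The sketch below assumes the same sample string; the column names outer_key, inner_key, inner_map and item are illustrative and do not come from the original answer.

```sql
-- Sketch: same flattening as the answer, but keeping the top-level key.
-- outer_key / inner_key / inner_map / item are hypothetical names.
WITH test AS (
    SELECT '{ "123": { "abc": { "id": "test", "data": "ipsum lorum" }, "abd": { "id": "test_new", "data": "lorum ipsum" } } }' AS str
),
struct_like AS (
    SELECT CAST(json_parse(str) AS map<varchar, map<varchar, map<varchar, varchar>>>) AS m
    FROM test
)
SELECT
    outer_key,
    inner_key,
    item['id']   AS id,
    item['data'] AS data
FROM struct_like
-- First UNNEST: one row per top-level key, with its nested map
CROSS JOIN UNNEST(m) AS o(outer_key, inner_map)
-- Second UNNEST: one row per second-level key, with its id/data map
CROSS JOIN UNNEST(inner_map) AS i(inner_key, item)
```

For the sample input this should return one row per inner key (abc and abd), each with outer_key equal to 123. Row order is not guaranteed unless an ORDER BY is added.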