[Posted]: 2021-03-14 12:58:44
[Question]:
I have a table with a column of JSON strings:
UserID json_string
100 [{"id": 77379513, "value": "35.4566", "os_type": null, "amount": "200", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "same'}]
100 [{"id": 77379514, "value": "38.658", "os_type": null, "amount": "100", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "niko'}]
100 [{"id": 77379515, "value": "40.569", "os_type": null, "amount": "150", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "koko'}]
200 [{"id": 77378899, "value": "25.365", "os_type": null, "amount": "100", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "same'}]
200 [{"id": 77378900, "value": "35.898", "os_type": null, "amount": "500", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "niko'}]
200 [{"id": 77378901, "value": "41.258", "os_type": null, "amount": "400", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "koko'}]
In the end, I need to convert each string into columns:
UserID ID value os_type amount created_at updated_at Type_name
100 77379513 35.4566 null 200 2020-08-16T14:48:27.611-04:00 2020-08-16T14:48:27.611-04:00 same
100 77379514 38.658 null 100 2020-08-16T14:48:27.611-04:00 2020-08-16T14:48:27.611-04:01 niko
100 77379515 40.569 null 150 2020-08-16T14:48:27.611-04:00 2020-08-16T14:48:27.611-04:02 koko
200 77378899 25.365 null 100 2020-09-16T14:48:27.611-04:01 2020-08-17T14:48:27.611-04:03 same
200 77378900 35.898 null 500 2020-09-16T14:48:27.611-04:02 2020-08-17T14:48:27.611-04:04 niko
200 77378901 41.258 null 400 2020-09-16T14:48:27.611-04:03 2020-08-17T14:48:27.611-04:05 koko
First, I tried to extract the JSON objects from the list:
SELECT iUserID,json_extract_array(json_string) as json_array
FROM `project.dataset.table`
That gives me a table like this:
UserID json_array
100 {"id": 77379513, "value": "35.4566", "os_type": null, "amount": "200", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "same'}
100 {"id": 77379514, "value": "38.658", "os_type": null, "amount": "100", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "niko'}
100 {"id": 77379515, "value": "40.569", "os_type": null, "amount": "150", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "koko'}
200 {"id": 77378899, "value": "25.365", "os_type": null, "amount": "100", "created_at": "2020-09-16T14:48:27.611-04:00", "updated_at": "2020-08-17T14:48:27.836-04:00", "Type_name": "same'}
200 {"id": 77378900, "value": "35.898", "os_type": null, "amount": "500", "created_at": "2020-09-16T14:48:27.611-04:00", "updated_at": "2020-08-17T14:48:27.836-04:00", "Type_name": "niko'}
200 {"id": 77378901, "value": "41.258", "os_type": null, "amount": "400", "created_at": "2020-09-16T14:48:27.611-04:00", "updated_at": "2020-08-17T14:48:27.836-04:00", "Type_name": "koko'}
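From this step I tried to use the JSON_EXTRACT_SCALAR function, but I get an error saying that this function does not work on arrays. So what is the correct way to extract the data into columns?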
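One possible approach, offered only as a sketch: JSON_EXTRACT_ARRAY returns an ARRAY<STRING>, so each element can be turned into its own row with UNNEST, and JSON_EXTRACT_SCALAR can then be applied to that single element instead of to the array. The query below assumes the key column is named UserID (as in the table header above, rather than iUserID), that the alias `item` is free to use, and that json_string holds valid JSON with double quotes throughout.

```sql
-- Sketch only: column name UserID and alias `item` are assumptions.
-- Assumes every json_string value is a valid JSON array of objects.
SELECT
  t.UserID,
  -- JSON_EXTRACT_SCALAR returns STRING, so cast the numeric id explicitly.
  CAST(JSON_EXTRACT_SCALAR(item, '$.id') AS INT64) AS ID,
  JSON_EXTRACT_SCALAR(item, '$.value')      AS value,
  -- A JSON null comes back as SQL NULL.
  JSON_EXTRACT_SCALAR(item, '$.os_type')    AS os_type,
  JSON_EXTRACT_SCALAR(item, '$.amount')     AS amount,
  JSON_EXTRACT_SCALAR(item, '$.created_at') AS created_at,
  JSON_EXTRACT_SCALAR(item, '$.updated_at') AS updated_at,
  JSON_EXTRACT_SCALAR(item, '$.Type_name')  AS Type_name
FROM `project.dataset.table` AS t,
  -- The comma is a correlated cross join: one output row per array element.
  UNNEST(JSON_EXTRACT_ARRAY(t.json_string)) AS item
```

Because the comma join is an inner join, rows whose array is empty or cannot be parsed would drop out of the result; a LEFT JOIN on the UNNEST would keep them with NULL columns.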
[Comments]:
-
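Out of curiosity, why store this data as JSON at all? It looks like every entry has the same fields. Why not simply create a table with real columns named after those fields?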
-
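By the way, I was wondering why the syntax highlighting made the row colors alternate, and then noticed that you use ' in one place instead of ". Keep in mind that these quote characters are not interchangeable in JSON. You must always use ".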
Tags: sql json google-bigquery