【Question title】: Querying tables in BigQuery
【Posted】: 2016-12-25 13:02:32
【Question】:

Background

I have a BigQuery table with a single column, "data", that holds JSON strings, as shown below.

 data    
 {"name":"x","mobile":999,"location":"abc"}
 {"name":"x1","mobile":9991,"location":"abc1"}

Now I want to group the rows with GROUP BY:

SELECT
    data
FROM
    table 
GROUP BY 
    json_extract(data,'$.location')

This query raises an error:

Expression JSON_EXTRACT([data], '$.location') in GROUP BY is invalid

So I changed the query to:

SELECT
    data, json_extract(data,'$.location') as l
FROM
    table 
GROUP BY
    l

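This query raises a different error:

Expression 'data' is not present in the GROUP BY list

Question

How can I use a JSON field in the GROUP BY clause?

And what are the limitations of storing JSON in a column, as far as querying is concerned?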

【Comments】:

  • It's not clear what result you expect. Can you give an example of the "aggregated" rows in the output?

Tags: mysql sql json group-by google-bigquery


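【Solution 1】:

You are grouping by location, but you are not applying an aggregate function to the data field, so the compiler does not know which data value to pick for each group or what you are aggregating over.

Just to illustrate, I put together this test query, which uses group_concat: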

select group_concat(data),location from
(
select * from
(SELECT '{"name":"x","mobile":999,"location":"abc"}' as data,json_extract('{"name":"x","mobile":999,"location":"abc"}','$.location') as location),
(SELECT '{"name":"x","mobile":111,"location":"abc"}' as data,json_extract('{"name":"x","mobile":111,"location":"abc"}','$.location') as location),
(SELECT '{"name":"x1","mobile":9991,"location":"abc1"}' as data,json_extract('{"name":"x1","mobile":9991,"location":"abc1"}','$.location') as location)

) d
group by location

This returns:

+-----+---------------------------------------------------------------------------------------------------+----------+--+
| Row | f0_                                                                                               | location |  |
+-----+---------------------------------------------------------------------------------------------------+----------+--+
| 1   | {"name":"x","mobile":999,"location":"abc"},"{""name"":""x"",""mobile"":111,""location"":""abc""}" | abc      |  |
+-----+---------------------------------------------------------------------------------------------------+----------+--+
| 2   | {"name":"x1","mobile":9991,"location":"abc1"}                                                     | abc1     |  |
+-----+---------------------------------------------------------------------------------------------------+----------+--+

The other aggregate functions are listed in BigQuery's aggregate functions documentation.
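For comparison, here is a minimal sketch of the same idea in BigQuery standard SQL rather than the legacy SQL used above. It assumes STRING_AGG as the aggregate, which plays a role similar to group_concat. The CTE name sample and the output column names row_count and all_data are made up for illustration, and the inline rows stand in for a real table.

```sql
-- Sketch only: assumes BigQuery standard SQL.
-- The CTE "sample" is a hypothetical stand-in for the real table.
WITH sample AS (
  SELECT '{"name":"x","mobile":999,"location":"abc"}' AS data UNION ALL
  SELECT '{"name":"x","mobile":111,"location":"abc"}' UNION ALL
  SELECT '{"name":"x1","mobile":9991,"location":"abc1"}'
)
SELECT
  JSON_EXTRACT_SCALAR(data, '$.location') AS location,  -- grouping key, returned without JSON quotes
  COUNT(*) AS row_count,                                 -- number of rows in each group
  STRING_AGG(data, ',') AS all_data                      -- every non-grouped column needs an aggregate
FROM sample
GROUP BY location;
```

The point is the same as in the answer: the extracted location is the grouping key, and data appears in the SELECT list only inside an aggregate.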

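【Comments】:

  • OK, good, but will this approach scale if I have a very large number of rows?
  • If you are talking about BigQuery, yes. Even petabyte-scale operations run fast.
  • The inner selects are just a way to run a static query without a table, purely as an example. Your data is already in a table, so you only need to query that table.
【Solution 2】:

Try the following: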

SELECT location,
  GROUP_CONCAT_UNQUOTED(REPLACE(data, ',"location":"' + location + '"', '')) AS data
FROM (
  SELECT data,
    JSON_EXTRACT_SCALAR(data,'$.location') AS location
  FROM YourTable
)
GROUP BY location  
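
The REPLACE call strips the ,"location":"..." fragment from each JSON string, since location is already returned as its own column. A rough standard SQL sketch of the same query follows, assuming STRING_AGG and CONCAT in place of the legacy GROUP_CONCAT_UNQUOTED and + operator. YourTable is the placeholder name from the query above.

```sql
-- Sketch only: assumes BigQuery standard SQL; YourTable is a placeholder.
SELECT
  location,
  -- Remove the location key from each JSON string, then join the strings per group.
  STRING_AGG(
    REPLACE(data, CONCAT(',"location":"', location, '"'), ''),
    ','
  ) AS data
FROM (
  SELECT
    data,
    JSON_EXTRACT_SCALAR(data, '$.location') AS location
  FROM YourTable
)
GROUP BY location;
```

Note that the string replacement only matches when location is not the first key in the object and is written exactly as ,"location":"value" with no extra whitespace, which holds for the sample rows in the question.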
