【Question title】: BigQuery - reducing to unique records within a field
【Posted】: 2019-01-28 09:53:59
【Question description】:

I have a table with fields like this:

ID    Field 1           Field 2
1     22,34,05,44,44    01,02,02,03
2     11,01,05          02,02,01,01,22

How can I transform this in BigQuery (Standard SQL) so that each field shows only its unique values, sorted in ascending order?

So the output would look like this:

ID    Field 1           Field 2
1     05,22,34,44       01,02,03
2     01,05,11          01,02,22

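I tried using SPLIT, but then I ended up with hundreds of duplicate rows, and window functions do not allow DISTINCT when combining the values back together afterwards.

Any help would be appreciated.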

【Comments】:

    Tags: google-bigquery unique distinct


    【Solution 1】:

    You can split each string into an array, then deduplicate the elements with DISTINCT and sort them with ORDER BY:

    SELECT
      ID,
      ARRAY(SELECT DISTINCT x FROM UNNEST(SPLIT(field1, ',')) AS x ORDER BY x) AS field1,
      ARRAY(SELECT DISTINCT x FROM UNNEST(SPLIT(field2, ',')) AS x ORDER BY x) AS field2
    FROM `project-name`.dataset.table
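
    To check the query without access to the real table, the sketch below runs the same expressions against the two sample rows from the question. The CTE name `sample` is a hypothetical stand-in for `project-name`.dataset.table, and the column names field1 and field2 are assumed to match the answer's query.

    ```sql
    -- Hypothetical inline data standing in for `project-name`.dataset.table
    WITH sample AS (
      SELECT 1 AS ID, '22,34,05,44,44' AS field1, '01,02,02,03' AS field2
      UNION ALL
      SELECT 2, '11,01,05', '02,02,01,01,22'
    )
    SELECT
      ID,
      -- SPLIT returns ARRAY<STRING>; DISTINCT drops repeats; ORDER BY sorts as strings
      ARRAY(SELECT DISTINCT x FROM UNNEST(SPLIT(field1, ',')) AS x ORDER BY x) AS field1,
      ARRAY(SELECT DISTINCT x FROM UNNEST(SPLIT(field2, ',')) AS x ORDER BY x) AS field2
    FROM sample
    ORDER BY ID
    -- ID 1: field1 = [05, 22, 34, 44], field2 = [01, 02, 03]
    -- ID 2: field1 = [01, 05, 11],     field2 = [01, 02, 22]
    ```

    Note that the elements are compared as strings. That gives the expected order here because every value is zero-padded to two digits; with unpadded values, '10' would sort before '9'.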
    

    If you want to turn the arrays back into comma-separated strings, wrap them in the ARRAY_TO_STRING function:

    SELECT
      ID,
      ARRAY_TO_STRING(ARRAY(SELECT DISTINCT x FROM UNNEST(SPLIT(field1, ',')) AS x ORDER BY x), ',') AS field1,
      ARRAY_TO_STRING(ARRAY(SELECT DISTINCT x FROM UNNEST(SPLIT(field2, ',')) AS x ORDER BY x), ',') AS field2
    FROM `project-name`.dataset.table
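
    As a possible alternative that is not part of the original answer, STRING_AGG accepts both DISTINCT and ORDER BY, so the string can be built in a single scalar subquery without an intermediate array. This is a minimal sketch using the question's sample rows; the CTE name `sample` is hypothetical.

    ```sql
    -- Hypothetical inline data standing in for `project-name`.dataset.table
    WITH sample AS (
      SELECT 1 AS ID, '22,34,05,44,44' AS field1, '01,02,02,03' AS field2
      UNION ALL
      SELECT 2, '11,01,05', '02,02,01,01,22'
    )
    SELECT
      ID,
      -- With DISTINCT, the ORDER BY key must be the aggregated expression itself (x)
      (SELECT STRING_AGG(DISTINCT x, ',' ORDER BY x)
       FROM UNNEST(SPLIT(field1, ',')) AS x) AS field1,
      (SELECT STRING_AGG(DISTINCT x, ',' ORDER BY x)
       FROM UNNEST(SPLIT(field2, ',')) AS x) AS field2
    FROM sample
    ORDER BY ID
    -- ID 1: field1 = '05,22,34,44', field2 = '01,02,03'
    -- ID 2: field1 = '01,05,11',    field2 = '01,02,22'
    ```

    Both forms should return the same strings for this data; which one reads better is largely a matter of taste.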
    

    【Discussion】:
