【Question title】: BigQuery: Transform data in multiple columns into row format
【Posted】: 2019-07-19 07:30:18
【Question】:

Suppose there is a table like the following in BQ:

SELECT "Desktop" AS Device, 24 AS col1, 9 AS col2, 28 AS col3, 7 AS col4, 98 AS col5, 77 AS col6, 59 AS col7 UNION ALL
SELECT "Mobile" AS Device, 8 AS col1, 43 AS col2, 75 AS col3, 44 AS col4, 38 AS col5, 31 AS col6, 46 AS col7 UNION ALL
SELECT "Tablet" AS Device, 7 AS col1, 9 AS col2, 34 AS col3, 86 AS col4, 62 AS col5, 69 AS col6, 74 AS col7

The real table can be as wide as roughly 100 columns.

I would like to transform this query so that I end up with the following result table:

SELECT "Desktop" AS Device, 24 AS Nr UNION ALL
SELECT "Desktop" AS Device, 9 AS Nr UNION ALL
SELECT "Desktop" AS Device, 28 AS Nr UNION ALL
SELECT "Desktop" AS Device, 7 AS Nr UNION ALL
SELECT "Desktop" AS Device, 98 AS Nr UNION ALL
SELECT "Desktop" AS Device, 77 AS Nr UNION ALL
SELECT "Desktop" AS Device, 59 AS Nr UNION ALL
SELECT "Mobile" AS Device, 8 AS Nr UNION ALL
SELECT "Mobile" AS Device, 43 AS Nr UNION ALL
SELECT "Mobile" AS Device, 75 AS Nr UNION ALL
Etc

Does anyone know how to achieve this?

【Question comments】:

    Tags: google-analytics google-bigquery


    【Solution 1】:

    Below is for BigQuery Standard SQL. A nice extra here is that it does not depend on the number or the names of the columns to unpivot.

    #standardSQL
    WITH raw AS (
      SELECT "Desktop" AS Device, 24 AS col1, 9 AS col2, 28 AS col3, 7 AS col4, 98 AS col5, 77 AS col6, 59 AS col7 UNION ALL
      SELECT "Mobile" AS Device, 8 AS col1, 43 AS col2, 75 AS col3, 44 AS col4, 38 AS col5, 31 AS col6, 46 AS col7 UNION ALL
      SELECT "Tablet" AS Device, 7 AS col1, 9 AS col2, 34 AS col3, 86 AS col4, 62 AS col5, 69 AS col6, 74 AS col7
    )
    SELECT Device, Nr FROM raw t, 
    UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING((SELECT AS STRUCT * EXCEPT(Device) FROM UNNEST([t]))), r'":([^,}]*)')) Nr 
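
    To make the one-liner above easier to follow, here is a minimal sketch that exposes its two intermediate steps as separate output columns. The one-row `raw` sample and the output column names `json_row` and `extracted_values` are made up for illustration and are not part of the original answer.

    ```sql
    #standardSQL
    -- Hypothetical one-row sample, cut down to three columns for readability.
    WITH raw AS (
      SELECT "Desktop" AS Device, 24 AS col1, 9 AS col2, 28 AS col3
    )
    SELECT
      Device,
      -- Step 1: serialise every column except Device into one JSON object,
      -- e.g. {"col1":24,"col2":9,"col3":28}
      TO_JSON_STRING((SELECT AS STRUCT * EXCEPT(Device) FROM UNNEST([t]))) AS json_row,
      -- Step 2: capture whatever follows each `":` up to the next comma or
      -- closing brace, e.g. ["24", "9", "28"]
      REGEXP_EXTRACT_ALL(
        TO_JSON_STRING((SELECT AS STRUCT * EXCEPT(Device) FROM UNNEST([t]))),
        r'":([^,}]*)'
      ) AS extracted_values
    FROM raw t
    ```

    Note that the extracted values are strings, so wrap `Nr` in something like `SAFE_CAST(Nr AS INT64)` if you need numbers again. The regex also assumes that no value contains a comma or a closing brace, which holds for plain numeric columns like the ones in the question but not necessarily for string columns.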
    

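    Update for OP's comment: "I completely forgot to include the requirement that the column name should also be added as a separate column"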

    #standardSQL
    -- Reuses the `raw` CTE from the first query; prepend that WITH clause to run this on its own.
    SELECT Device, SPLIT(pair, ':')[OFFSET(0)] AS col, SPLIT(pair, ':')[OFFSET(1)] AS Nr 
    FROM raw t, 
    UNNEST(SPLIT(REGEXP_REPLACE(TO_JSON_STRING((SELECT AS STRUCT * EXCEPT(Device) FROM UNNEST([t]))), r'["{}]', ''))) pair  
    

    Applied to the same sample data, the result looks like this:

    Row Device  col     Nr   
    1   Desktop col1    24   
    2   Desktop col2    9    
    3   Desktop col3    28   
    4   Desktop col4    7    
    5   Desktop col5    98   
    6   Desktop col6    77   
    7   Desktop col7    59   
    8   Mobile  col1    8    
    9   Mobile  col2    43   
    10  Mobile  col3    75   
    11  Mobile  col4    44   
    12  Mobile  col5    38   
    13  Mobile  col6    31   
    14  Mobile  col7    46   
    15  Tablet  col1    7    
    16  Tablet  col2    9    
    17  Tablet  col3    34   
    18  Tablet  col4    86   
    19  Tablet  col5    62   
    20  Tablet  col6    69   
    21  Tablet  col7    74   
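
    For readers on a current BigQuery release: the same Device / col / Nr shape can also be produced with the `UNPIVOT` operator, which I believe was only added to BigQuery after this answer was posted. This is a sketch of an alternative, not the approach used above. Unlike the JSON trick it needs the columns listed explicitly, but it keeps `Nr` as a number instead of a string.

    ```sql
    #standardSQL
    WITH raw AS (
      SELECT "Desktop" AS Device, 24 AS col1, 9 AS col2, 28 AS col3, 7 AS col4, 98 AS col5, 77 AS col6, 59 AS col7 UNION ALL
      SELECT "Mobile" AS Device, 8 AS col1, 43 AS col2, 75 AS col3, 44 AS col4, 38 AS col5, 31 AS col6, 46 AS col7 UNION ALL
      SELECT "Tablet" AS Device, 7 AS col1, 9 AS col2, 34 AS col3, 86 AS col4, 62 AS col5, 69 AS col6, 74 AS col7
    )
    SELECT Device, col, Nr
    FROM raw
    -- Nr receives the cell value, col receives the source column name.
    UNPIVOT (Nr FOR col IN (col1, col2, col3, col4, col5, col6, col7))
    -- Sorting only to make the output easy to compare with the table above.
    ORDER BY Device, col
    ```

    By default `UNPIVOT` drops rows whose value is NULL; add `INCLUDE NULLS` after the `UNPIVOT` keyword if those rows should be kept.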
    

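    【Comments】:

    • Thanks Mikhail, this is very handy when there are many columns. However, I completely forgot the requirement that the column name should also be added as a separate column, so that I know, for example, that the value 24 in the first row belongs to "col1". Is that possible as well?
    • So why did you accept the previous answer then? Anyway, post a new question, or update this one with whatever you really need, and I will answer it or update my answer accordingly. Meanwhile, consider at least voting up.
    • Anyway, see the update in my answer, and please don't forget to vote up.
    • Mikhail, to answer your question: I accepted the previous answer because it met my original requirement. The extra requirement only came up later, when I first replied to your code. I will take your feedback on board and post a new question next time. In any case, thanks for the updated code, it works perfectly.
    • Sure, no problem, understood.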
    【Solution 2】:

    You can put the numeric columns into an ARRAY and use UNNEST:

    with raw as (
    SELECT "Desktop" AS Device, 24 AS col1, 9 AS col2, 28 AS col3, 7 AS col4, 98 AS col5, 77 AS col6, 59 AS col7 UNION ALL
    SELECT "Mobile" AS Device, 8 AS col1, 43 AS col2, 75 AS col3, 44 AS col4, 38 AS col5, 31 AS col6, 46 AS col7 UNION ALL
    SELECT "Tablet" AS Device, 7 AS col1, 9 AS col2, 34 AS col3, 86 AS col4, 62 AS col5, 69 AS col6, 74 AS col7
    )
    select Device,  Nr
    from raw
    left join UNNEST ([col1, col2, col3,col4,col5,col6,col7]) Nr
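
    If the column name is needed next to each value (the follow-up requirement raised in the comments on Solution 1), the same array idea can be extended by unnesting an array of STRUCTs instead of an array of plain numbers. This is a sketch, not part of the original answer; the alias `pair` and the field names `col` and `Nr` are chosen here only to match the output shown under Solution 1.

    ```sql
    #standardSQL
    WITH raw AS (
      SELECT "Desktop" AS Device, 24 AS col1, 9 AS col2, 28 AS col3, 7 AS col4, 98 AS col5, 77 AS col6, 59 AS col7 UNION ALL
      SELECT "Mobile" AS Device, 8 AS col1, 43 AS col2, 75 AS col3, 44 AS col4, 38 AS col5, 31 AS col6, 46 AS col7 UNION ALL
      SELECT "Tablet" AS Device, 7 AS col1, 9 AS col2, 34 AS col3, 86 AS col4, 62 AS col5, 69 AS col6, 74 AS col7
    )
    SELECT Device, pair.col, pair.Nr
    FROM raw
    -- One STRUCT per source column: the label is a string literal,
    -- the value is the column itself, so Nr stays an INT64.
    CROSS JOIN UNNEST([
      STRUCT('col1' AS col, col1 AS Nr),
      STRUCT('col2' AS col, col2 AS Nr),
      STRUCT('col3' AS col, col3 AS Nr),
      STRUCT('col4' AS col, col4 AS Nr),
      STRUCT('col5' AS col, col5 AS Nr),
      STRUCT('col6' AS col, col6 AS Nr),
      STRUCT('col7' AS col, col7 AS Nr)
    ]) AS pair
    ```

    The trade-off compared with Solution 1 is that every column has to be spelled out twice, which gets tedious at around 100 columns, but the values keep their original type and nothing depends on parsing a JSON string.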
    

    【Comments】:
