【Question title】: Splitting value in a column into all columns of a row in BigQuery
【Posted】: 2017-03-14 21:47:37
【Question description】:

I have a table with 7 columns, as shown below:

date                         org   cus_id   prod_id   sales_qty   sales_amount   profit_amount
30-AUG-14 55 12 34 56 78 99  null   null      null     null           null        null
31-AUG-14 22 32 43 65 76 88  null   null      null     null           null        null

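In fact, the value in the first column is the concatenation of the values for all columns in each row. I want to fix this by splitting the value in the first column across all the columns. The expected output is as follows: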

date        org   cus_id   prod_id   sales_qty   sales_amount   profit_amount
 30-AUG-14   55       12        34          56             78              99  
 31-AUG-14   22       32        43          65             76              88  

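I think splitting the string value like this should work, but I am not familiar with how to split it and put the pieces into the existing columns. Do you have any suggestions? Thanks in advance.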

【Comments】:

    Tags: sql google-bigquery


    【Solution 1】:

    You can use a user-defined function (UDF) to expand the value into existing or new columns.

    #standardSQL
    CREATE TEMPORARY FUNCTION AddField(s STRUCT<tdate STRING, org INT64,cus_id INT64,prod_id INT64,sales_qty INT64,sales_amount INT64,profit_amount INT64>)
      RETURNS STRUCT<tdate STRING, org INT64,cus_id INT64,prod_id INT64,sales_qty INT64,sales_amount INT64,profit_amount INT64> LANGUAGE js AS """
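    // Split the packed string once; fields[0] is the date itself.
    var fields = s.tdate.split(' ');
      // Keep only the date in tdate so it matches the expected output.
      s.tdate=fields[0];
      s.org=fields[1];
      s.cus_id=fields[2];
      s.prod_id=fields[3];
      s.sales_qty=fields[4];
      s.sales_amount=fields[5];
      s.profit_amount=fields[6];
      return s;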
    """;
    with mytable as (
    select 
    "30-AUG-14 55 12 34 56 78 99" as tdate,  null as org,   null as cus_id,    null as prod_id , null as sales_qty ,null as  sales_amount ,null as  profit_amount
    union all
    select "31-AUG-14 22 32 43 65 76 88" as tdate,  null as org,   null as cus_id,    null as prod_id , null as sales_qty ,null as  sales_amount ,null as  profit_amount
    )
    SELECT AddField(t).*
    FROM mytable AS t;
    

    To pass a row value to a JavaScript function using standard SQL, define a function that takes a STRUCT of the same row type as the table.

    For example:

    s STRUCT<tdate STRING, org INT64,cus_id INT64,prod_id INT64,sales_qty INT64,sales_amount INT64,profit_amount INT64>
    

    Then transform your existing values with JavaScript code:

     var fields = s.tdate.split(' ');
     s.org=fields[1];
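
Because the UDF body is plain JavaScript, the splitting logic can be tried outside BigQuery first. The following is a minimal sketch, assuming the row arrives as an object whose keys match the STRUCT fields; the function name `splitRow` and the sample object are hypothetical and not part of the answer. It splits on runs of whitespace, sets `tdate` to the first token so that column holds only the date, and leaves the other values as strings.

```javascript
// Hypothetical stand-in for the UDF body: takes an object shaped like the
// STRUCT (tdate, org, cus_id, ...) and spreads the tokens of tdate over it.
function splitRow(s) {
  var names = ['tdate', 'org', 'cus_id', 'prod_id',
               'sales_qty', 'sales_amount', 'profit_amount'];
  // Split on runs of whitespace so repeated spaces do not create empty tokens.
  var fields = s.tdate.trim().split(/\s+/);
  names.forEach(function (name, i) {
    // Missing tokens become null instead of undefined.
    s[name] = i < fields.length ? fields[i] : null;
  });
  return s;
}

var row = splitRow({
  tdate: '30-AUG-14 55 12 34 56 78 99',
  org: null, cus_id: null, prod_id: null,
  sales_qty: null, sales_amount: null, profit_amount: null
});
console.log(row.tdate, row.org, row.profit_amount);
```

Looping over a list of field names keeps the mapping between token position and column in one place, which may be easier to maintain than one assignment per column.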
    

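    You can add logic so that existing values are not overwritten, or create the values as new columns, and then run a query like this over the whole row: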

    SELECT AddField(t).*
    FROM mytable AS t;
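
The "do not overwrite if it exists" logic mentioned above could look something like the sketch below. It is an illustration only: the function name `fillMissing` and the sample row are hypothetical, and it assumes that an unset column reaches the JavaScript code as `null`.

```javascript
// Hypothetical variant of the UDF body: fill a column from the packed string
// only when it is currently null, so values that already exist are kept.
function fillMissing(s) {
  var names = ['org', 'cus_id', 'prod_id',
               'sales_qty', 'sales_amount', 'profit_amount'];
  var fields = s.tdate.trim().split(/\s+/);
  names.forEach(function (name, i) {
    var token = fields[i + 1]; // fields[0] is the date
    var isEmpty = s[name] === null || s[name] === undefined;
    if (isEmpty && token !== undefined) {
      s[name] = token;
    }
  });
  // Keep only the date part in the first column.
  s.tdate = fields[0];
  return s;
}

var row = fillMissing({
  tdate: '31-AUG-14 22 32 43 65 76 88',
  org: 7, cus_id: null, prod_id: null,
  sales_qty: null, sales_amount: null, profit_amount: null
});
console.log(row.org, row.cus_id); // org keeps its existing value, cus_id is filled
```

Inside the UDF, the statements of `fillMissing` would go between the triple quotes in place of the unconditional assignments.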
    

    You can find more complex UDF examples in the migration guide and the UDF docs.

    【Comments】:

      【Solution 2】:

      Try the following:

      #standardSQL
      SELECT 
        SPLIT(date, ' ')[OFFSET(0)] AS date,        
        SPLIT(date, ' ')[OFFSET(1)] AS org,   
        SPLIT(date, ' ')[OFFSET(2)] AS cus_id,   
        SPLIT(date, ' ')[OFFSET(3)] AS prod_id,   
        SPLIT(date, ' ')[OFFSET(4)] AS sales_qty,   
        SPLIT(date, ' ')[OFFSET(5)] AS sales_amount,   
        SPLIT(date, ' ')[OFFSET(6)] AS profit_amount  
      FROM yourTable
      

      You can test it with the dummy data in the example below:

      #standardSQL
      WITH yourTable AS (
        SELECT '30-AUG-14 55 12 34 56 78 99' AS date, NULL AS org, NULL AS cus_id, NULL AS prod_id, NULL AS sales_qty, NULL AS sales_amount, NULL AS profit_amount UNION ALL
        SELECT '31-AUG-14 22 32 43 65 76 88', NULL, NULL, NULL, NULL, NULL, NULL
      )
      SELECT 
        SPLIT(date, ' ')[OFFSET(0)] AS date,        
        SPLIT(date, ' ')[OFFSET(1)] AS org,   
        SPLIT(date, ' ')[OFFSET(2)] AS cus_id,   
        SPLIT(date, ' ')[OFFSET(3)] AS prod_id,   
        SPLIT(date, ' ')[OFFSET(4)] AS sales_qty,   
        SPLIT(date, ' ')[OFFSET(5)] AS sales_amount,   
        SPLIT(date, ' ')[OFFSET(6)] AS profit_amount  
      FROM yourTable  
      

      If you need to convert the fields to INT64, use the following:

      #standardSQL
      SELECT 
        SPLIT(date, ' ')[OFFSET(0)] AS date,        
        CAST(SPLIT(date, ' ')[OFFSET(1)] AS INT64) AS org,   
        CAST(SPLIT(date, ' ')[OFFSET(2)] AS INT64) AS cus_id,   
        CAST(SPLIT(date, ' ')[OFFSET(3)] AS INT64) AS prod_id,   
        CAST(SPLIT(date, ' ')[OFFSET(4)] AS INT64) AS sales_qty,   
        CAST(SPLIT(date, ' ')[OFFSET(5)] AS INT64) AS sales_amount,   
        CAST(SPLIT(date, ' ')[OFFSET(6)] AS INT64) AS profit_amount  
      FROM yourTable
      

      【Comments】:
