【Question title】: BigQuery SQL: Decrease remaining minutes counter in successive rows calculation [duplicate]
【Posted】: 2021-01-19 23:12:37
【Question】:

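I have some logs with a date and an aggregated minutes counter.

Every hour (one row) is filled in with a trick like this one: Duplicate groups of records to fill multiple date gaps in Google BigQuery

The problem:

I would like to fill in the time column for as long as minutes are still available (with the understanding that each hour holds at most 60 minutes).

Here is the desired output: remainingTime is carried over from the previous rows. Assume remainingTime starts at 70 minutes.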

#standardSQL
WITH history AS (
  SELECT '2017-01-01' AS date, 'a' AS product, 0 AS minutes UNION ALL
  SELECT '2017-01-02' AS date, 'a' AS product, 100 AS minutes UNION ALL
  SELECT '2017-01-03' AS date, 'a' AS product, 0 AS minutes UNION ALL
  SELECT '2017-01-04' AS date, 'a' AS product, 0 AS minutes UNION ALL
  SELECT '2017-01-05' AS date, 'a' AS product, 30 AS minutes UNION ALL
  SELECT '2017-01-06' AS date, 'a' AS product, 0 AS minutes UNION ALL
  SELECT '2017-01-01' AS date, 'b' AS product, 100 AS minutes UNION ALL
  SELECT '2017-01-02' AS date, 'b' AS product, 0 AS minutes UNION ALL
  SELECT '2017-01-03' AS date, 'b' AS product, 0 AS minutes UNION ALL
  SELECT '2017-01-04' AS date, 'b' AS product, 0 AS minutes UNION ALL
  SELECT '2017-01-05' AS date, 'b' AS product, 0 AS minutes
)
SELECT * FROM history ORDER BY product, date

+---------+------------+---------+---------------+---------------------+
| product |    date    | minutes | remainingTime |       time          |
+---------+------------+---------+---------------+---------------------+
|    a    | 2017-01-01 |   0     |  10           | 60 (max 60 reached) | // 0 + 70 - 60 = 10
|    a    | 2017-01-02 |   100   |  50           | 60 (same)           | // 100 + 10 - 60 = 50
|    a    | 2017-01-03 |   0     |  0            | 50 (only 50/60)     | // 0 + 50 - 50 = 0
|    a    | 2017-01-04 |   0     |  0            | 0 (and so on)       | // 0 + 0 - 0 = 0
|    a    | 2017-01-05 |   30    |  0            | 30                  | // 30 + 0 - 30 = 0
|    a    | 2017-01-06 |   0     |  0            | 0                   | // 0 + 0 - 0 = 0
+---------+------------+---------+---------------+---------------------+

... and so on for other products
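The arithmetic in the comments beside the table follows a simple carry-over rule: each row takes at most 60 minutes from whatever is available (its own minutes plus the leftover from earlier rows), and the excess moves on to the next row. A minimal JavaScript sketch of that rule, assuming the rows of one product are already sorted by date; the helper name `fillTime` and its parameters are hypothetical and only serve to check the expected numbers:

```javascript
// Hypothetical helper (not part of the question): walks one product's rows
// in date order and carries the leftover minutes forward.
// The defaults 70 and 60 are the values assumed in the question.
function fillTime(minutesPerRow, initialRemaining = 70, cap = 60) {
  let remaining = initialRemaining;
  return minutesPerRow.map((minutes) => {
    const available = minutes + remaining; // what could be booked in this row
    const time = Math.min(available, cap); // a row holds at most `cap` minutes
    remaining = available - time;          // the overflow moves to the next row
    return { minutes, remainingTime: remaining, time };
  });
}

// Product "a" from the sample data.
const rowsA = fillTime([0, 100, 0, 0, 30, 0]);
console.log(rowsA.map((r) => r.time).join(", "));          // 60, 60, 50, 0, 30, 0
console.log(rowsA.map((r) => r.remainingTime).join(", ")); // 10, 50, 0, 0, 0, 0
```

Both printed sequences match the time and remainingTime columns of the table for product a.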

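I have almost finished a complex and ugly query, but I am currently stuck on an intermediate calculated column..

(PS: I haven't practiced SQL in years, so I am relearning the basics and discovering BigQuery Standard SQL at the same time)

Thanks!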

【Comments】:

    Tags: sql google-bigquery popsql


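    【Solution 1】:

    BigQuery does not natively support recursive operations (as of this answer). Try combining array_agg() with a JavaScript user-defined function, although this approach does not scale very well: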

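    CREATE TEMP FUNCTION special_sum(x ARRAY<INT64>)
    RETURNS INT64
    LANGUAGE js
    AS """
      // x holds the minutes of every row from the start of the partition up to the current row.
      var remaining = 70;  // initial remainingTime assumed in the question
      var time = 0;
      for (const num of x)
      {
         // Each row takes at most 60 minutes; the excess carries over to the next row.
         time = Math.min(parseInt(num) + remaining, 60);
         remaining = parseInt(num) + remaining - time;
      }
      return time;  // time of the last (current) row
    """;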
    
    WITH history AS (
      SELECT '2017-01-01' AS date, 'a' AS product, 0 AS minutes UNION ALL
      SELECT '2017-01-02' AS date, 'a' AS product, 100 AS minutes UNION ALL
      SELECT '2017-01-03' AS date, 'a' AS product, 0 AS minutes UNION ALL
      SELECT '2017-01-04' AS date, 'a' AS product, 0 AS minutes UNION ALL
      SELECT '2017-01-05' AS date, 'a' AS product, 30 AS minutes UNION ALL
      SELECT '2017-01-06' AS date, 'a' AS product, 0 AS minutes UNION ALL
      SELECT '2017-01-01' AS date, 'b' AS product, 100 AS minutes UNION ALL
      SELECT '2017-01-02' AS date, 'b' AS product, 0 AS minutes UNION ALL
      SELECT '2017-01-03' AS date, 'b' AS product, 0 AS minutes UNION ALL
      SELECT '2017-01-04' AS date, 'b' AS product, 0 AS minutes UNION ALL
      SELECT '2017-01-05' AS date, 'b' AS product, 0 AS minutes 
    )
    select *, 
     special_sum(array_agg(minutes) over (partition by product order by date rows unbounded preceding)) as time 
    from history
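To see why this does not scale well, it can help to replay outside BigQuery what the window does: for every row, the frame `rows unbounded preceding` hands the function the whole prefix of the partition, and the function recomputes the carry-over from the first row again. The sketch below is only an illustration under that reading; `specialSum` mirrors the body of `special_sum`, while `timeColumn` and the `loopSteps` counter are hypothetical names added here to count the work done.

```javascript
let loopSteps = 0; // illustration only: loop iterations across all calls

// Same logic as the body of special_sum, written as a plain function.
function specialSum(x) {
  let remaining = 70;
  let time = 0;
  for (const num of x) {
    time = Math.min(Number(num) + remaining, 60);
    remaining = Number(num) + remaining - time;
    loopSteps += 1;
  }
  return time;
}

// Emulates array_agg(minutes) over (... rows unbounded preceding):
// row i receives the array of rows 0..i of its partition.
function timeColumn(partitionMinutes) {
  return partitionMinutes.map((_, i) =>
    specialSum(partitionMinutes.slice(0, i + 1))
  );
}

const timesA = timeColumn([0, 100, 0, 0, 30, 0]); // product "a"
console.log(timesA.join(", ")); // 60, 60, 50, 0, 30, 0
console.log(loopSteps);         // 21, i.e. 6 * 7 / 2
```

For a partition of n rows the loop body runs n(n+1)/2 times in total, so the cost grows quadratically with the number of rows per product, and each call also receives an array as long as its prefix.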
    

    【Comments】:

    • TY Sergey! I upvoted your solution because it works well! Looks interesting.. I was looking for the "smartest" way to achieve this