【Question title】: Sum results on constant timeframe range on each date in table
【Posted】: 2020-12-30 16:11:55
【Question】:

I am using a PostgreSQL database.
I have a table containing test names, results, and report times:

|test_name|result |report_time|
|    A    |error  |29/11/2020 |
|    A    |failure|28/12/2020 |
|    A    |error  |29/12/2020 |
|    B    |passed |30/12/2020 |
|    C    |failure|31/12/2020 |
|    A    |error  |31/12/2020 |

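For each date, I want to count how many tests failed or errored in the 30 days up to and including that date (and limit the output to the 5 days before the current date), so the final result would be: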

|    date    | sum |  (notes)
| 29/11/2020 |  1  | 1 failed/errored test in range (29/11 -> 29/10)
| 28/12/2020 |  2  | 2 failed/errored tests in range (28/12 -> 28/11)
| 29/12/2020 |  3  | 3 failed/errored tests in range (29/12 -> 29/11)
| 30/12/2020 |  2  | 2 failed/errored tests in range (30/12 -> 30/11)
| 31/12/2020 |  4  | 4 failed/errored tests in range (31/12 -> 01/12)

I know how to sum the results for each date (i.e., how many failures/errors occurred on a specific date):

SELECT report_time::date AS "Report Time", count(case when result in ('failure', 'error') then 1 else null end)
from table
where report_time::date = now()::date
GROUP BY report_time::date, count(case when result in ('failure', 'error') then 1 else null end)

But I am struggling to sum over the 30 days preceding each date.

【Comments】:

  • You cannot put an aggregate function in GROUP BY, because nothing has been counted yet when the grouping happens.

Tags: sql postgresql sum aggregate


【Solution 1】:

You can generate the dates and then use a window function:

select gs.dte, num_failed_error, num_failed_error_30
from generate_series(current_date - interval '5 day', current_date, interval '1 day') gs(dte) left join
     (select t.report_time, count(*) as num_failed_error,
             sum(count(*)) over (order by report_time range between interval '30 day' preceding and current row) as num_failed_error_30
      from t
      where t.result in ('failure', 'error') and
            t.report_time >= current_date - interval '35 day'
      group by t.report_time
     ) t
     on t.report_time = gs.dte ;

Note: this assumes report_time is a plain date with no time component. If it has a time component, use report_time::date instead.

If you have data for every day, this can be simplified to:

select t.report_time, count(*) as num_failed_error,
       sum(count(*)) over (order by report_time range between interval '30 day' preceding and current row) as num_failed_error_30
from t
where t.result in ('failure', 'error') and
      t.report_time >= current_date - interval '35 day'
group by t.report_time
order by report_time desc
limit 5;
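
To make the window frame in the queries above easier to follow, here is a minimal Python sketch of what `range between interval '30 day' preceding and current row` computes over the per-day counts. It is not part of the original answer: it uses the six sample rows from the question, and the names `rows`, `daily`, `rolling_30`, `report_dates`, and `totals` are made up for illustration.

```python
from collections import Counter
from datetime import date, timedelta

# Sample rows from the question: (test_name, result, report_date).
rows = [
    ("A", "error",   date(2020, 11, 29)),
    ("A", "failure", date(2020, 12, 28)),
    ("A", "error",   date(2020, 12, 29)),
    ("B", "passed",  date(2020, 12, 30)),
    ("C", "failure", date(2020, 12, 31)),
    ("A", "error",   date(2020, 12, 31)),
]

# Step 1: the WHERE + GROUP BY part, i.e. failures/errors per report date.
daily = Counter(d for _, result, d in rows if result in ("failure", "error"))


def rolling_30(day):
    """Sum the daily counts whose date lies in [day - 30 days, day]."""
    start = day - timedelta(days=30)
    return sum(n for d, n in daily.items() if start <= d <= day)


# Step 2: evaluate the frame for every date that appears in the table.
report_dates = sorted({d for _, _, d in rows})
totals = {d: rolling_30(d) for d in report_dates}

for d, total in totals.items():
    print(d.isoformat(), total)
```

For the sample data this yields 1, 2, 3, 2, 4 for the five dates, matching the table in the question. Note that the sketch evaluates the frame for every date in the table, including 30/12, which has no failed or errored test. In the SQL, the grouped subquery has no row for such a day, so the left join against the generated dates would return NULL there.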

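【Comments】:

  • I get the error "RANGE PRECEDING is only supported with UNBOUNDED". Could it be that Postgres doesn't support it?
  • @ocp1000 . . . Postgres supports range preceding as of version 11 (dbfiddle.uk/…)
  • It looks like I'm on 10.12, which doesn't support it. Can you explain what exactly is happening in the window function? I'll look into how to implement it in 10.12.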
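【Solution 2】:

Since I am on PostgreSQL 10.12 and upgrading is not currently an option, I took a different approach: generate the dates for the last 30 days and, for each date, compute the cumulative distinct sum over the preceding 30 days: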

SELECT days_range::date, SUM(number_of_tests)
FROM   generate_series (now() - interval '30 day', now()::timestamp , '1 day'::interval) days_range
CROSS  JOIN LATERAL (
        -- distinct tests per report day, keeping only days with at least one failure/error
        SELECT COUNT(DISTINCT(test_name)) as number_of_tests from tests
        WHERE report_time > days_range - interval '30 day'
          AND report_time <= days_range  -- upper bound so reports after the generated day are not counted
        GROUP BY report_time::date
        HAVING COUNT(case when result in ('failure', 'error') then 1 else null end) > 0
    ) as lateral_query
GROUP BY days_range
ORDER BY days_range desc;

This is definitely not the most optimized query; it takes about 1 minute to run.
