【Question Title】: How to count aggregate distinct IDs per hour since the beginning of the day?
【Posted】: 2021-04-27 11:22:37
【Question】:

I need to write a query that counts distinct IDs and accumulates them over time.

For example:

The result should be:

【Comments】:

    Tags: sql aggregate distinct vertica


    【Solution 1】:

    For each id, record the first hour in which it appears on a given day, then take a cumulative sum:

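    -- Inner query: the first hour in which each id appears on each day.
    -- Outer query: count the new ids per hour, then keep a running total within the day.
    select day, hour, sum(count(*)) over (partition by day order by hour) as cumulative_ids
    from (select day, id, min(hour) as hour
          from t
          group by day, id
         ) t
    group by day, hour
    order by day, hour;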
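
The same idea (first appearance per id, then a running total) can be sketched outside the database. This is only an illustration in Python, not part of the original answer: the function name `cumulative_new_ids` and the `(day, hour, id)` row layout are assumptions made for the example.

```python
from collections import Counter


def cumulative_new_ids(rows):
    """Running count of distinct ids per (day, hour).

    rows is an iterable of (day, hour, id) tuples (hypothetical layout).
    Only hours in which at least one id appears for the first time
    that day are present in the result.
    """
    # Step 1: earliest hour per (day, id), like min(hour) ... group by day, id.
    first_hour = {}
    for day, hour, id_ in rows:
        key = (day, id_)
        if key not in first_hour or hour < first_hour[key]:
            first_hour[key] = hour

    # Step 2: number of ids whose first appearance falls in each (day, hour).
    new_per_hour = Counter((day, hour) for (day, _), hour in first_hour.items())

    # Step 3: running total within each day, ordered by hour.
    result = {}
    running = {}
    for day, hour in sorted(new_per_hour):
        running[day] = running.get(day, 0) + new_per_hour[(day, hour)]
        result[(day, hour)] = running[day]
    return result
```

An hour in which only previously seen ids occur does not show up in the result, which matches the behaviour of the query above.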
    

    Note: this assumes you really do want the hours counted within a given day.

    You can also express this with an explicit subquery:

    select day, hour, sum(cnt) over (partition by day order by hour) as cumulative_ids
    from (select day, hour, count(*) as cnt
          from (select day, id, min(hour) as hour
                from t
                group by day, id
               ) t
          group by day, hour
         ) h
    order by day, hour;
    

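    The queries above leave out an hour unless a new id first appears in that hour. To include every hour that has rows, you can use window functions: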

    -- seqnum = 1 marks the first row of each id within a day.
    select day, hour,
           sum(sum( (seqnum = 1)::int )) over (partition by day order by hour) as cumulative_ids
    from (select day, hour, id,
                 row_number() over (partition by day, id order by hour) as seqnum
          from t
         ) t
    group by day, hour
    order by day, hour;
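
To make the role of the `seqnum = 1` flag concrete, here is a rough Python sketch of the same logic. It is an illustration under assumptions, not the answer's own code: `cumulative_ids_all_hours` and the `(day, hour, id)` tuple layout are hypothetical names chosen for the example.

```python
def cumulative_ids_all_hours(rows):
    """Running distinct-id count for every (day, hour) that has rows.

    rows is an iterable of (day, hour, id) tuples (hypothetical layout).
    """
    seen = set()   # (day, id) pairs already encountered
    flags = {}     # (day, hour) -> number of first occurrences in that hour

    # Sorting by (day, hour, id) plays the role of "order by hour" inside
    # row_number() over (partition by day, id order by hour).
    for day, hour, id_ in sorted(rows):
        is_first = (day, id_) not in seen   # corresponds to seqnum = 1
        seen.add((day, id_))
        flags[(day, hour)] = flags.get((day, hour), 0) + int(is_first)

    # Running total of the flags within each day.
    result = {}
    running = {}
    for day, hour in sorted(flags):
        running[day] = running.get(day, 0) + flags[(day, hour)]
        result[(day, hour)] = running[day]
    return result
```

Because every row contributes a flag of 0 or 1, an hour with no new ids still appears, carrying the previous running total forward. Hours with no rows at all are still absent.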
    

    You can also express this with an explicit subquery:

    select day, hour, sum(cnt) over (partition by day order by hour) as cumulative_ids
    from (select day, hour, sum( (seqnum = 1)::int ) as cnt
          from (select day, hour, id,
                       row_number() over (partition by day, id order by hour) as seqnum
                from t
               ) t
          group by day, hour
         ) dh
    order by day, hour;
    

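    【Comments】:

    • 1. It does not run in Vertica. 2. You missed the hour column in the inner query. I get this error: "Error: aggregate function calls cannot contain analytic function calls"
    • @erezlale . . . Wow! That is a very mysterious limitation. I have never heard of such a problem in any other database, including Postgres, which provided the code base for Vertica.
    【Solution 2】: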

    This gives the distinct ID count for each hour.

    -- Sample data in a T-SQL table variable.
    declare @tmp as table(RecordNumber int, Day date, Hour int, ID int);
    insert into @tmp(RecordNumber, Day, Hour, ID) values
     (1, '2021-04-27', 1, 100)
    ,(2, '2021-04-27', 1, 100)
    ,(3, '2021-04-27', 1, 101)
    ,(4, '2021-04-27', 2, 202)
    ,(5, '2021-04-27', 2, 203)
    ,(6, '2021-04-27', 3, 101)
    ,(7, '2021-04-27', 8, 305)
    ,(8, '2021-04-27', 16, 100)
    ,(9, '2021-04-27', 22, 30)
    ,(10, '2021-04-27', 22, 1000)
    ,(11, '2021-04-27', 23, 100)
    ,(12, '2021-04-27', 23, 50);

    -- group by already yields one row per (Day, Hour), so select distinct is not needed.
    select Day, Hour, count(distinct ID) as ID_Count
    from @tmp
    group by Day, Hour
    order by Day, Hour;
    

    Output (the Day column is omitted):

     Hour   ID_Count
       1    2
       2    2
       3    1
       8    1
      16    1
      22    2
      23    2
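
Note that these are per-hour counts, not a running total since the start of the day. As a rough comparison, the Python sketch below recomputes the per-hour counts from the same sample rows and adds a cumulative distinct count next to them. The names `ROWS` and `per_hour_and_cumulative` are assumptions made for this illustration, and the rows are reduced to `(hour, id)` pairs because the sample covers a single day.

```python
from collections import defaultdict

# (Hour, ID) pairs copied from the sample rows above; all share one day.
ROWS = [
    (1, 100), (1, 100), (1, 101), (2, 202), (2, 203), (3, 101),
    (8, 305), (16, 100), (22, 30), (22, 1000), (23, 100), (23, 50),
]


def per_hour_and_cumulative(rows):
    """Return (hour, distinct ids in that hour, distinct ids so far) tuples."""
    ids_by_hour = defaultdict(set)
    for hour, id_ in rows:
        ids_by_hour[hour].add(id_)

    seen = set()   # every id encountered up to and including the current hour
    out = []
    for hour in sorted(ids_by_hour):
        seen |= ids_by_hour[hour]
        out.append((hour, len(ids_by_hour[hour]), len(seen)))
    return out


for hour, per_hour, cumulative in per_hour_and_cumulative(ROWS):
    print(hour, per_hour, cumulative)
```

The second column matches the ID_Count output above. The third column is the kind of running distinct count the question asks for.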
    

    【Comments】:
