【Question title】:Get the maximum value from rows in Postgres records and group by multiple columns
【Posted】:2021-04-06 00:06:45
【Question】:

I have a table like this:

p_id |      createdat      | pagetitle | sessionid |      text       | device  | deviceserial
------+---------------------+-----------+-----------+-----------------+---------+--------------
      | 2020-11-27 08:07:39 |           |           | App launch      | android | 636363636890
      | 2020-09-01 08:08:18 |           |           | search          | Android | 636363636890
      | 2020-09-02 08:10:10 |           |           | scan            | Android | 636363636890
      | 2020-09-02 08:12:10 |           |           | destroy         | Android | 636363636890
      | 2020-09-02 08:40:11 |           |           | hi              | IOS     | 6625839827
      | 2020-09-02 08:45:11 |           |           | hi              | IOS     | 6625839827
      | 2020-09-02 08:43:10 |           |           | launchComponent | Android | 636363636890
      | 2020-09-02 08:50:11 |           |           | hi              | IOS     | 6625839827
      | 2020-09-02 08:47:10 |           |           | launchComponent | Android | 636363636890
      | 2020-09-02 08:53:11 |           |           | hi              | IOS     | 6625839827
      | 2020-09-02 08:50:10 |           |           | launchComponent | Android | 636363636890
      | 2020-09-02 08:55:11 |           |           | hi              | IOS     | 6625839827
      | 2020-09-02 08:52:10 |           |           | launchComponent | Android | 636363636890
      | 2020-09-02 09:00:11 |           |           | hi              | IOS     | 6625839827
      | 2020-09-02 08:55:10 |           |           | launchComponent | Android | 636363636890
      | 2020-09-02 09:05:11 |           |           | hi              | IOS     | 6625839827
      | 2020-09-02 08:59:10 |           |           | launchComponent | Android | 636363636890
      | 2020-09-02 09:07:11 |           |           | hi              | Android | 6625839827
      | 2020-09-02 09:01:10 |           |           | launchComponent | Android | 636363636890
      | 2020-09-02 09:09:11 |           |           | hi              | IOS     | 6625839827
      | 2020-09-02 09:03:10 |           |           | launchComponent | Android | 636363636890
      | 2020-09-02 09:09:11 |           |           | hi              | Android | 6625839828
      | 2020-09-02 09:03:10 |           |           | launchComponent | IOS     | 636363636891
      | 2020-09-02 09:13:11 |           |           | hi              | Android | 6625839828
      | 2020-09-02 09:06:10 |           |           | launchComponent | IOS     | 636363636891
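
For anyone who wants to reproduce this, here is a minimal sketch of the table with a handful of the rows shown above. The column types are assumptions, since the question does not give the schema; the table name devices is the one used in the query later in the question.

```sql
-- Assumed schema: the question does not state the column types.
create table devices (
    p_id         integer,
    createdat    timestamp,
    pagetitle    text,
    sessionid    text,
    "text"       text,   -- quoted because the column shares its name with a type
    device       text,
    deviceserial text
);

-- A few rows copied from the table above; the empty columns are left NULL.
insert into devices (createdat, "text", device, deviceserial) values
    ('2020-09-02 08:10:10', 'scan',            'Android', '636363636890'),
    ('2020-09-02 08:12:10', 'destroy',         'Android', '636363636890'),
    ('2020-09-02 08:40:11', 'hi',              'IOS',     '6625839827'),
    ('2020-09-02 09:09:11', 'hi',              'Android', '6625839828'),
    ('2020-09-02 09:03:10', 'launchComponent', 'IOS',     '636363636891');
```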

From this table, I want to get a result like this:

 deviceserial | event_count |         hr          | device
--------------+-------------+---------------------+---------
 6625839828   |           2 | 2020-09-02 09:00:00 | Android
 636363636890 |           8 | 2020-09-02 08:00:00 | Android
 636363636891 |           2 | 2020-09-02 09:00:00 | IOS
 6625839827   |           5 | 2020-09-02 08:00:00 | IOS
 

Here are my steps: I group the records by deviceserial, by hour (as hr), and by device, and then take the maximum event count.

I tried this query:

select deviceserial,max(event_count) as event_count,hr,device
from (
    select deviceserial,count(*) as event_count,
        date_trunc('hour', createdat) as hr,device
    from devices  
    group by deviceserial,hr,device
) t
group by deviceserial,hr,device

This is my result:

 deviceserial | event_count |         hr          | device
--------------+-------------+---------------------+---------
 636363636890 |           1 | 2020-11-27 08:00:00 | android
 636363636891 |           2 | 2020-09-02 09:00:00 | IOS
 6625839827   |           4 | 2020-09-02 09:00:00 | IOS
 6625839827   |           5 | 2020-09-02 08:00:00 | IOS
 636363636890 |           8 | 2020-09-02 08:00:00 | Android
 636363636890 |           1 | 2020-09-01 08:00:00 | Android
 636363636890 |           2 | 2020-09-02 09:00:00 | Android
 6625839828   |           2 | 2020-09-02 09:00:00 | Android

【Question comments】:

    Tags: sql postgresql count aggregate-functions greatest-n-per-group


    【Solution 1】:

    If I understand correctly, you can use distinct on:

    select distinct on (deviceserial) 
        deviceserial,
        count(*) as event_count,
        date_trunc('hour', createdat) as hr,
        device
    from devices  
    group by deviceserial, hr, device
    order by deviceserial, event_count desc
    

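    This gives you the hour and device with the most events for each deviceserial. Note, however, that it does not handle ties: it returns exactly one row per deviceserial, even when several hour/device groups share the top count. If you want to keep top ties, use rank() instead: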

    select *
    from (
        select deviceserial,
            count(*) as event_count,
            date_trunc('hour', createdat) as hr,
            device,
            rank() over(partition by deviceserial order by count(*) desc) rn
        from devices  
        group by deviceserial, hr, device
    ) t
    where rn = 1
    order by deviceserial
    

    Or, in Postgres 13:

    select deviceserial,
        count(*) as event_count,
        date_trunc('hour', createdat) as hr,
        device
    from devices  
    group by deviceserial, hr, device
    order by rank() over(partition by deviceserial order by count(*) desc)
    fetch first row with ties
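
As a variation on the distinct on query at the top of this answer, the following sketch adds tie-breakers to the order by so that the single row kept per deviceserial is deterministic. Preferring the most recent hour, and then ordering by device, is an assumption made for illustration; the question does not say which row should win a tie.

```sql
-- distinct on keeps the first row of each deviceserial group in order by order.
-- The extra sort keys (hr desc, device) are an assumed tie-break rule.
select distinct on (deviceserial)
    deviceserial,
    count(*) as event_count,
    date_trunc('hour', createdat) as hr,
    device
from devices
group by deviceserial, date_trunc('hour', createdat), device
order by deviceserial, event_count desc, hr desc, device;
```

Without the extra sort keys, which of the tied rows is returned is not guaranteed.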
    

    【Comments】:

      【Solution 2】:

      You can use the window function rank() as follows:

      select * from
      (select deviceserial,count(*) as event_count,
              date_trunc('hour', createdat) as hr, device,
              rank() over (partition by deviceserial order by count(*) desc) as rn
          from devices  
          group by deviceserial,hr,device) t
      where rn = 1
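
If exactly one row per deviceserial is wanted instead of all tied rows, a sketch along the same lines could swap rank() for row_number(). The second sort key (latest hour first) is an assumption used only to make the pick deterministic.

```sql
-- row_number() numbers rows 1, 2, 3, ... within each deviceserial,
-- so rn = 1 matches a single row even when counts are tied.
select deviceserial, event_count, hr, device
from (
    select deviceserial,
           count(*) as event_count,
           date_trunc('hour', createdat) as hr,
           device,
           row_number() over (
               partition by deviceserial
               order by count(*) desc, date_trunc('hour', createdat) desc
           ) as rn
    from devices
    group by deviceserial, date_trunc('hour', createdat), device
) t
where rn = 1
order by deviceserial;
```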
      

      【Comments】:
