【Title】: Running total percentage per record over total sum. Presto/Athena/SQL
【Posted】: 2021-12-06 10:25:33
【Question】:

I am trying to calculate a cumulative percentage for each row in Presto/Athena. For example, suppose I have data like this:

AccountID | UserID | HolidaysTaken
ABC       | A      | 4
ABC       | B      | 6
ABC       | B      | 3
ABC       | K      | 2
ABC       | K      | 3
ABC       | X      | 1

After running this query, I get the following result.

SELECT AccountID, UserID, sum(HolidaysTaken) AS HolidaysTaken FROM table
WHERE AccountID = 'ABC'
GROUP BY AccountID, UserID
ORDER BY HolidaysTaken DESC

AccountID | UserID | HolidaysTaken 
ABC       | B      | 9             
ABC       | K      | 5             
ABC       | A      | 4             
ABC       | X      | 1 

Total holidays taken by all users = 19

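But I want to add 2 more columns. EachUserPercentage: each user's holidays as a percentage of the total holidays. CumulativePercentage: the running total of EachUserPercentage. This I can get working using this post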

AccountID | UserID | HolidaysTaken | EachUserPercentage | CumulativePercentage
ABC       | B      | 9             | 47.36              | 47.36  
ABC       | K      | 5             | 26.31              | 73.67
ABC       | A      | 4             | 21.05              | 94.72
ABC       | X      | 1             | 5.26               | 100

I tried different window functions, percent_rank(), cume_dist() and ntile(), but could not get EachUserPercentage to work.

【Question comments】:

    Tags: sql amazon-athena presto


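    【Solution 1】:

    You can use a window function to find each user's percentage within the AccountID, and then another window function to sum the rows with an UNBOUNDED PRECEDING frame, ordered by each UserID's total holidays taken. Like the following: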

    WITH totalUser
    AS (SELECT   AccountID
                ,UserID
                ,SUM(HolidaysTaken) AS HolidaysTaken
                ,CAST(100.0 * SUM(HolidaysTaken) / SUM(SUM(HolidaysTaken)) OVER (PARTITION BY AccountID) AS NUMERIC(5, 2)) AS EachUserPercentage
        FROM     table
        WHERE    AccountID = 'ABC'
        GROUP BY AccountID
                ,UserID)
    SELECT   totalUser.AccountID
            ,totalUser.UserID
            ,totalUser.HolidaysTaken
            ,totalUser.EachUserPercentage
            ,SUM(totalUser.EachUserPercentage) OVER (PARTITION BY totalUser.AccountID
                                                     ORDER BY totalUser.EachUserPercentage DESC
                                                     ROWS UNBOUNDED PRECEDING) AS CumulativePercentage
    FROM     totalUser
    ORDER BY totalUser.HolidaysTaken DESC;
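
A self-contained variant of the same idea may be easier to experiment with. This is only a sketch under a few assumptions that are not in the answer: the inline `holidays` data stands in for the real table, the share is computed as a DOUBLE so the result does not depend on how the engine types the literal `100.0`, the running total accumulates the unrounded values and rounds only for display, and `UserID` is added as a tie-breaker so the row order is deterministic.

```sql
-- Hypothetical inline data standing in for the real table.
WITH holidays AS (
    SELECT *
    FROM (VALUES ('ABC', 'A', 4),
                 ('ABC', 'B', 6),
                 ('ABC', 'B', 3),
                 ('ABC', 'K', 2),
                 ('ABC', 'K', 3),
                 ('ABC', 'X', 1)) AS t (AccountID, UserID, HolidaysTaken)
),
totalUser AS (
    SELECT AccountID,
           UserID,
           SUM(HolidaysTaken) AS HolidaysTaken,
           -- Unrounded share of the account total, computed as a double.
           100 * CAST(SUM(HolidaysTaken) AS DOUBLE)
               / SUM(SUM(HolidaysTaken)) OVER (PARTITION BY AccountID) AS pct
    FROM holidays
    GROUP BY AccountID, UserID
)
SELECT AccountID,
       UserID,
       HolidaysTaken,
       ROUND(pct, 2) AS EachUserPercentage,
       -- Accumulate the unrounded shares; round only the displayed value.
       ROUND(SUM(pct) OVER (PARTITION BY AccountID
                            ORDER BY HolidaysTaken DESC, UserID
                            ROWS UNBOUNDED PRECEDING), 2) AS CumulativePercentage
FROM totalUser
ORDER BY HolidaysTaken DESC, UserID;
```

Because this rounds rather than truncates, the per-user shares for the sample data come out as 47.37, 26.32, 21.05 and 5.26 (9/19, 5/19, 4/19 and 1/19 of the total), slightly different from the truncated figures in the question.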
    

    【Comments】:

    • You're right, @MatBailie. I have edited accordingly! Thanks.
    • Demo: dbfiddle.uk/…
    【Solution 2】:

    Hi, if your grouping is on AccountID (which is what this assumes), you can get EachUserPercentage simply from the following query.

    SELECT table.AccountID, UserID, sum(table.HolidaysTaken) AS HolidaysTaken,
    MAX(CAST(all_sum.HolidaysTaken AS NUMERIC(12,2))),
    (SUM(CAST(table.HolidaysTaken AS NUMERIC(12,2)))/MAX(CAST(all_sum.HolidaysTaken AS NUMERIC(12,2))))*100 EachUserPercentage
     FROM table
    LEFT OUTER JOIN (SELECT SUM(HolidaysTaken) AS HolidaysTaken,AccountID FROM table GROUP BY AccountID)all_sum ON all_sum.AccountID= table.AccountID
    WHERE table.AccountID = 'ABC'
    GROUP BY table.AccountID, UserID
    ORDER BY HolidaysTaken DESC
    

    It works for me.
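
The query above stops at EachUserPercentage. One possible way to add CumulativePercentage on top of the same join-based total is to keep a running sum of the raw holiday counts and divide that by the account total, instead of summing percentages. The sketch below assumes hypothetical names (`holidays`, `per_user`, `TotalHolidays`) and inline sample data that are not part of the answer.

```sql
-- Hypothetical inline data standing in for the real table.
WITH holidays AS (
    SELECT *
    FROM (VALUES ('ABC', 'A', 4),
                 ('ABC', 'B', 6),
                 ('ABC', 'B', 3),
                 ('ABC', 'K', 2),
                 ('ABC', 'K', 3),
                 ('ABC', 'X', 1)) AS t (AccountID, UserID, HolidaysTaken)
),
-- One row per account with its grand total (the joined subquery in the answer).
all_sum AS (
    SELECT AccountID, SUM(HolidaysTaken) AS TotalHolidays
    FROM holidays
    GROUP BY AccountID
),
-- One row per user with that user's total.
per_user AS (
    SELECT AccountID, UserID, SUM(HolidaysTaken) AS HolidaysTaken
    FROM holidays
    GROUP BY AccountID, UserID
)
SELECT p.AccountID,
       p.UserID,
       p.HolidaysTaken,
       ROUND(100 * CAST(p.HolidaysTaken AS DOUBLE) / a.TotalHolidays, 2) AS EachUserPercentage,
       -- Running sum of holiday counts, divided by the account total.
       ROUND(100 * CAST(SUM(p.HolidaysTaken) OVER (PARTITION BY p.AccountID
                                                   ORDER BY p.HolidaysTaken DESC, p.UserID
                                                   ROWS UNBOUNDED PRECEDING) AS DOUBLE)
             / a.TotalHolidays, 2) AS CumulativePercentage
FROM per_user p
JOIN all_sum a ON a.AccountID = p.AccountID
WHERE p.AccountID = 'ABC'
ORDER BY p.HolidaysTaken DESC, p.UserID;
```

Dividing the running count by the total means the last row is exactly the full total over itself, so the cumulative column ends at 100 regardless of how the individual percentages round.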

    【Comments】:
