【Question title】: How to get the cumulative sum of an aggregate column?
【Posted】: 2022-01-13 18:40:16
【Question description】:

I have this query in BigQuery. It returns each contributing_factor_vehicle_1 value with its count and its share of the overall total:

SELECT
    TBL_TOTAL.contributing_factor_vehicle_1,
    TBL_TOTAL.TOTAL,
    (TBL_TOTAL.TOTAL / SUM(TBL_TOTAL.TOTAL) OVER ()) * 100  AS PERCENTAGE
FROM 
    (SELECT 
         contributing_factor_vehicle_1,
         COUNT(contributing_factor_vehicle_1) AS TOTAL
     FROM 
         `bigquery-public-data.new_york_mv_collisions.nypd_mv_collisions` 
     WHERE 
         borough = 'BROOKLYN' 
         AND contributing_factor_vehicle_1 <> 'Unspecified'
     GROUP BY 
         contributing_factor_vehicle_1
     ORDER BY 
         TOTAL DESC) TBL_TOTAL
ORDER BY 
    TOTAL DESC

Output:

contributing_factor_vehicle_1    TOTAL   PERCENTAGE
Driver Inattention/Distraction   65427   28.913538237178777
Failure to Yield Right-of-Way    25831   11.415250679452903
Backing Unsafely                 16384   7.240426895286917
Following Too Closely            12605   5.570408997503148
Passing Too Closely              10875   4.805886382217116

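Now I need the cumulative PERCENTAGE, as shown below, to do a Pareto analysis. How can I achieve this? Is it possible to use the PERCENTAGE column again inside a window function?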

contributing_factor_vehicle_1    TOTAL   PERCENTAGE   PERCENTAGE_CUM
Driver Inattention/Distraction   65427   28.91%       28.91%
Failure to Yield Right-of-Way    25831   11.42%       40.33%
Backing Unsafely                 16384   7.24%        47.57%
Following Too Closely            12605   5.57%        53.14%
Passing Too Closely              10875   4.81%        57.95%

【Discussion】:

    Tags: sql google-bigquery window-functions


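    【Solution 1】:

    Just add one more line to the outer SELECT, as in the example below. The cumulative percentage is the running sum of TOTAL (ordered from largest to smallest) divided by the grand total.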

    SELECT
      TBL_TOTAL.contributing_factor_vehicle_1,
      TBL_TOTAL.TOTAL,
      -- Share of the grand total (the empty OVER () spans all rows)
      ROUND((TBL_TOTAL.TOTAL / SUM(TBL_TOTAL.TOTAL) OVER ()) * 100, 2) AS PERCENTAGE,
      -- Running total (largest first) divided by the grand total
      ROUND((SUM(TBL_TOTAL.TOTAL) OVER (ORDER BY TOTAL DESC) / SUM(TBL_TOTAL.TOTAL) OVER ()) * 100, 2) AS PERCENTAGE_CUM
    FROM
    (
        -- The ORDER BY inside the subquery was redundant; the outer ORDER BY decides the final order
        SELECT
            contributing_factor_vehicle_1,
            COUNT(contributing_factor_vehicle_1) AS TOTAL
        FROM `bigquery-public-data.new_york_mv_collisions.nypd_mv_collisions`
        WHERE borough = 'BROOKLYN' AND contributing_factor_vehicle_1 <> 'Unspecified'
        GROUP BY contributing_factor_vehicle_1
    ) TBL_TOTAL
    ORDER BY TOTAL DESC
    

    This returns the output with the added PERCENTAGE_CUM column.
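
On the second part of the question (reusing PERCENTAGE inside a window function): in BigQuery a SELECT-list alias cannot be referenced by another expression in the same SELECT list, so PERCENTAGE has to be computed one level down first. The sketch below is one possible way to do that with CTEs. The CTE names `tbl_total` and `tbl_pct`, the explicit `ROWS` frame, and the tie-breaker on `contributing_factor_vehicle_1` are assumptions added for illustration and are not part of the original answer.

```sql
-- Sketch: compute PERCENTAGE in a CTE, then take a running SUM over that column.
WITH tbl_total AS (
  SELECT
    contributing_factor_vehicle_1,
    COUNT(contributing_factor_vehicle_1) AS total
  FROM `bigquery-public-data.new_york_mv_collisions.nypd_mv_collisions`
  WHERE borough = 'BROOKLYN'
    AND contributing_factor_vehicle_1 <> 'Unspecified'
  GROUP BY contributing_factor_vehicle_1
),
tbl_pct AS (  -- hypothetical name; holds the unrounded percentage
  SELECT
    contributing_factor_vehicle_1,
    total,
    total / SUM(total) OVER () * 100 AS percentage
  FROM tbl_total
)
SELECT
  contributing_factor_vehicle_1,
  total,
  ROUND(percentage, 2) AS percentage,
  -- SUM(percentage) reads the unrounded column from tbl_pct, not the alias above
  ROUND(
    SUM(percentage) OVER (
      ORDER BY total DESC, contributing_factor_vehicle_1  -- tie-breaker (assumption)
      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ),
    2
  ) AS percentage_cum
FROM tbl_pct
ORDER BY total DESC, contributing_factor_vehicle_1;
```

Summing the unrounded percentage and rounding only at the end avoids accumulating rounding error. The explicit `ROWS` frame matters only when two factors share the same count: with `ORDER BY` and no frame clause, the default frame is `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`, which gives tied rows the same cumulative value.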
