【Question title】: Unable to calculate median - SQL Server 2017
【Posted】: 2020-08-08 06:20:33
【Question description】:

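I am trying to calculate the median of transactions in each category. A few notes (the data set below is a small snippet of a much larger data set):

  • An employee can belong to more than one category
  • The median for each transaction should be > 0
  • Not every person appears in every category

The data is set up like this: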

|  Person | Category | Transaction |
|:-------:|:--------:|:-----------:|
| PersonA |   Sales  |      27     |
| PersonB |   Sales  |      75     |
| PersonC |   Sales  |      87     |
| PersonD |   Sales  |      36     |
| PersonE |   Sales  |      70     |
| PersonB |   Buys   |      60     |
| PersonC |   Buys   |      92     |
| PersonD |   Buys   |      39     |
| PersonA |    HR    |      59     |
| PersonB |    HR    |      53     |
| PersonC |    HR    |      98     |
| PersonD |    HR    |      54     |
| PersonE |    HR    |      70     |
| PersonA |   Other  |      46     |
| PersonC |   Other  |      66     |
| PersonD |   Other  |      76     |
| PersonB |   Other  |      2      |

The ideal output would look like this:

| Category | Median | Average |
|:--------:|:------:|:-------:|
|   Sales  |   70   |    59   |
|   Buys   |   60   |    64   |
|    HR    |   59   |    67   |
|   Other  |   56   |    48   |

I can get the average with:

SELECT
    Category,
    AVG(Transaction) AS Average_Transactions
FROM
    table
GROUP BY
    Category

That works fine!

This post tried to help me find the median. What I wrote is:

SELECT
    Category,
    PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Transaction) OVER (PARTITION BY Category) AS Median_Transactions
FROM
    table
GROUP BY
    Category

But I get an error:

Msg 8120: Column 'Transactions' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

How can I fix this?

【Comments】:

    Tags: sql sql-server aggregate-functions median


    【Solution 1】:

    I don't think this is pretty, but it works. I didn't take the time to polish it.

    with
    avg_t as
    (  select category, avg(sales) as avg_sales
       from sample
       group by category  -- SQL Server does not accept ordinal positions such as GROUP BY 1
    ),
    mn as
    (  select category, avg(sales) as median_sales
       from (
         select category, sales,
                row_number() over (partition by category order by sales asc) as r,
                count(person) over (partition by category) as total_count
         from sample
            ) mn_sub
       -- even row count: keep the two middle rows; odd row count: keep the single middle row
       where (total_count % 2 = 0 and r in ((total_count / 2), ((total_count / 2) + 1))) or
             (total_count % 2 <> 0 and r = ((total_count + 1) / 2))
       group by category
    )
    select avg_t.category, avg_t.avg_sales, mn.median_sales
    from avg_t
    inner join mn
    on avg_t.category = mn.category
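
The row-numbering idea in this answer can be tried against the question's sample rows without creating a table. The following is only a sketch under assumptions: the CTE name `tx` and the column name `Amount` are illustrative stand-ins, and the single `IN` test replaces the odd/even branches by relying on integer division, so it is a variation on the answer rather than the answerer's exact query.

```sql
-- Sample rows from the question. "tx" and "Amount" are illustrative names only.
WITH tx (Person, Category, Amount) AS (
    SELECT v.Person, v.Category, v.Amount
    FROM (VALUES
        ('PersonA', 'Sales', 27), ('PersonB', 'Sales', 75), ('PersonC', 'Sales', 87),
        ('PersonD', 'Sales', 36), ('PersonE', 'Sales', 70),
        ('PersonB', 'Buys', 60),  ('PersonC', 'Buys', 92),  ('PersonD', 'Buys', 39),
        ('PersonA', 'HR', 59),    ('PersonB', 'HR', 53),    ('PersonC', 'HR', 98),
        ('PersonD', 'HR', 54),    ('PersonE', 'HR', 70),
        ('PersonA', 'Other', 46), ('PersonC', 'Other', 66), ('PersonD', 'Other', 76),
        ('PersonB', 'Other', 2)
    ) AS v (Person, Category, Amount)
),
ranked AS (
    SELECT Category, Amount,
           ROW_NUMBER() OVER (PARTITION BY Category ORDER BY Amount) AS r,
           COUNT(*)     OVER (PARTITION BY Category)                 AS n
    FROM tx
)
SELECT Category,
       -- 1.0 * Amount avoids integer averaging when the two middle values sum to an odd number
       AVG(1.0 * Amount) AS Median_Transactions
FROM ranked
-- With integer division, (n + 1) / 2 and (n + 2) / 2 are the same row for odd n
-- and the two middle rows for even n.
WHERE r IN ((n + 1) / 2, (n + 2) / 2)
GROUP BY Category;
```

For the sample rows this should give 70 for Sales, 60 for Buys, 59 for HR, and 56 for Other.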
    

    【Comments】:

      【Solution 2】:

      This is not optimal, but here is a solution for you:

      SELECT DISTINCT
             category,
             PERCENTILE_DISC(0.5)WITHIN GROUP(ORDER BY val) OVER (PARTITION BY category) AS Median_Transactions,
             AVG(val) OVER (PARTITION BY d.category)  [AVG]
      FROM #data d;
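
To run the query above, the temp table it reads from has to exist first. A minimal setup sketch follows; the column names `person`, `category`, and `val` and their types are assumptions inferred from the query, and the rows are the sample data from the question.

```sql
-- Assumed schema for the #data temp table; the answer does not show its definition.
CREATE TABLE #data (
    person   varchar(20) NOT NULL,
    category varchar(20) NOT NULL,
    val      int         NOT NULL
);

-- Sample rows copied from the question.
INSERT INTO #data (person, category, val) VALUES
    ('PersonA', 'Sales', 27), ('PersonB', 'Sales', 75), ('PersonC', 'Sales', 87),
    ('PersonD', 'Sales', 36), ('PersonE', 'Sales', 70),
    ('PersonB', 'Buys', 60),  ('PersonC', 'Buys', 92),  ('PersonD', 'Buys', 39),
    ('PersonA', 'HR', 59),    ('PersonB', 'HR', 53),    ('PersonC', 'HR', 98),
    ('PersonD', 'HR', 54),    ('PersonE', 'HR', 70),
    ('PersonA', 'Other', 46), ('PersonC', 'Other', 66), ('PersonD', 'Other', 76),
    ('PersonB', 'Other', 2);
```

Because `val` is declared as `int` in this sketch, `AVG` uses integer arithmetic, so Buys comes out as 63 rather than the 64 shown in the question's expected output.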
      

      【Comments】:

        【Solution 3】:

        You can use SELECT DISTINCT to do what you want:

        SELECT DISTINCT Category,
               PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Transaction) OVER (PARTITION BY Category) AS Median_Transactions
        FROM table;
        

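        Unfortunately, SQL Server does not offer the PERCENTILE_ functions as aggregate functions, only as window functions, and it has no MEDIAN() aggregate function. You can also do this with subqueries and counts.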
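
The question's expected median for Other (56) is the mean of the two middle values, 46 and 66. PERCENTILE_DISC cannot return that, because it always picks a value that occurs in the data. The sketch below swaps in PERCENTILE_CONT, which interpolates, and keeps the SELECT DISTINCT pattern. The names `tx` and `Amount` are illustrative assumptions (TABLE and TRANSACTION are reserved words in T-SQL, so the question's placeholder names are avoided), and the rounding of the average is a guess at how the expected output was produced.

```sql
-- Sample rows from the question. "tx" and "Amount" are illustrative names only.
WITH tx (Person, Category, Amount) AS (
    SELECT v.Person, v.Category, v.Amount
    FROM (VALUES
        ('PersonA', 'Sales', 27), ('PersonB', 'Sales', 75), ('PersonC', 'Sales', 87),
        ('PersonD', 'Sales', 36), ('PersonE', 'Sales', 70),
        ('PersonB', 'Buys', 60),  ('PersonC', 'Buys', 92),  ('PersonD', 'Buys', 39),
        ('PersonA', 'HR', 59),    ('PersonB', 'HR', 53),    ('PersonC', 'HR', 98),
        ('PersonD', 'HR', 54),    ('PersonE', 'HR', 70),
        ('PersonA', 'Other', 46), ('PersonC', 'Other', 66), ('PersonD', 'Other', 76),
        ('PersonB', 'Other', 2)
    ) AS v (Person, Category, Amount)
)
SELECT DISTINCT
    Category,
    -- PERCENTILE_CONT interpolates between the two middle rows when the count is even.
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Amount)
        OVER (PARTITION BY Category) AS Median_Transactions,
    -- Cast before averaging so AVG does not truncate, then round to a whole number.
    CAST(ROUND(AVG(CAST(Amount AS decimal(10, 2)))
        OVER (PARTITION BY Category), 0) AS int) AS Average_Transactions
FROM tx;
```

No GROUP BY is used: both expressions are window functions evaluated per row, and DISTINCT collapses the repeated rows to one per category. Adding GROUP BY Category instead is what triggers error 8120, since the ORDER BY column inside the window function is then neither grouped nor aggregated.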

        【Comments】:
