【Question title】:GROUP BY CASE with overlapping ranges
【Posted】:2021-02-11 13:50:32
【Question】:

Table:

CREATE TABLE companies (name VARCHAR(100) NOT NULL, number_of_employees INT NOT NULL, yearly_revenue DECIMAL(30, 2) NOT NULL);
INSERT INTO companies(name, number_of_employees, yearly_revenue) VALUES ('A', 5, 100), ('B', 5, 150), ('C', 15, 200), ('D', 25, 500), ('E', 25, 800);

If I want to count companies grouped by number of employees, I can write this query:

SELECT
    COUNT(*) AS count,
    CASE
        WHEN number_of_employees BETWEEN 0 AND 10 THEN '0-10'
        WHEN number_of_employees BETWEEN 10 AND 20 THEN '10-20'
        WHEN number_of_employees > 20 THEN '20-INF'
    END AS number_of_employees
FROM companies
GROUP BY 
    CASE
        WHEN number_of_employees BETWEEN 0 AND 10 THEN '0-10'
        WHEN number_of_employees BETWEEN 10 AND 20 THEN '10-20'
        WHEN number_of_employees > 20 THEN '20-INF'
    END

It outputs:

count   number_of_employees
------------------------------
2       0-10
1       10-20
2       20-INF

Now my goal is to count companies when the grouping ranges overlap. An example query to illustrate what I am after:

SELECT
    COUNT(*) AS count,
    CASE
        WHEN number_of_employees > 0 THEN '> 0'
        WHEN number_of_employees > 10 THEN '> 10'
        WHEN number_of_employees > 20 THEN '> 20'
    END AS number_of_employees
FROM companies
GROUP BY 
    CASE
        WHEN number_of_employees > 0 THEN '> 0'
        WHEN number_of_employees > 10 THEN '> 10'
        WHEN number_of_employees > 20 THEN '> 20'
    END

It outputs:

count   number_of_employees
------------------------------
5       > 0

But the result I want looks like this:

count   number_of_employees
------------------------------
5       > 0
3       > 10
2       > 20

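How can I achieve this?

Bonus question: besides COUNT(*), I would also like an answer that works for other aggregate functions, especially the average (AVG).

【Comments】:

  • Please try to avoid inserting strings into numeric columns. You defined yearly revenue as DECIMAL(30,2) and then insert values such as '100', which are strings...

Tags: mysql sql mariadb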


【Solution 1】:

I think I would use a cumulative sum for this particular problem:

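-- CASE returns the first matching branch, so test the most restrictive range first.
-- The alias is "bucket" because GROUPING is a reserved word in MySQL 8.0.
SELECT (CASE WHEN number_of_employees > 20 THEN '> 20'
             WHEN number_of_employees > 10 THEN '> 10'
             WHEN number_of_employees > 0 THEN '> 0'
        END) AS bucket,
       -- Running total from the largest bucket down, so '> 0' ends up including every company.
       SUM(COUNT(*)) OVER (ORDER BY MIN(number_of_employees) DESC) AS count
FROM companies
GROUP BY bucket
ORDER BY MIN(number_of_employees);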
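
The cumulative-sum idea can be checked against the sample data from the question. What follows is only a minimal sketch, not the answerer's own code: it assumes Python's built-in sqlite3 module as a stand-in for MySQL/MariaDB so that the example is self-contained, and the names bucket, disjoint and cumulative are illustrative. It first counts disjoint buckets (with the CASE branches ordered from most to least restrictive, since CASE returns the first branch that matches) and then accumulates those counts from the most restrictive bucket down.

```python
import sqlite3
from itertools import accumulate

# In-memory SQLite database loaded with the question's sample data
# (SQLite is an assumption here, standing in for MySQL/MariaDB).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (name VARCHAR(100) NOT NULL, number_of_employees INT NOT NULL, yearly_revenue DECIMAL(30, 2) NOT NULL);
INSERT INTO companies(name, number_of_employees, yearly_revenue) VALUES ('A', 5, 100), ('B', 5, 150), ('C', 15, 200), ('D', 25, 500), ('E', 25, 800);
""")

# Step 1: disjoint buckets. Each company lands in exactly one bucket.
disjoint = conn.execute("""
    SELECT CASE WHEN number_of_employees > 20 THEN '> 20'
                WHEN number_of_employees > 10 THEN '> 10'
                WHEN number_of_employees > 0  THEN '> 0'
           END AS bucket,
           COUNT(*) AS n
    FROM companies
    GROUP BY bucket
    ORDER BY MIN(number_of_employees) DESC
""").fetchall()

# Step 2: running total, starting from the most restrictive bucket.
labels = [bucket for bucket, _ in disjoint]
totals = list(accumulate(n for _, n in disjoint))
cumulative = list(zip(labels, totals))

print(disjoint)    # disjoint counts per bucket
print(cumulative)  # overlapping ("greater than") counts
```

The accumulation only works because the thresholds are nested: every company with more than 20 employees also has more than 10 and more than 0. Counts and sums can be accumulated this way; an average would need the cumulative sum divided by the cumulative count rather than a running total of averages.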

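【Comments】:

  • This code doesn't work. In any case, window functions are executed after GROUP BY.
  • @user2180613 ... Provide sample data and I will show that it works.

【Solution 2】:

Join your data to a mapping table and group by the mapping table's column.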

SELECT
  map.label,
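  COUNT(companies.name)  -- counts matched companies only; with LEFT JOIN, COUNT(*) reports 1 for a label with no match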
FROM
(
  SELECT  0 AS min_emp, '>  0' AS label
  UNION ALL
  SELECT 10 AS min_emp, '> 10' AS label
  UNION ALL
  SELECT 20 AS min_emp, '> 20' AS label
)
  AS map
LEFT JOIN
  companies
    ON companies.number_of_employees > map.min_emp
GROUP BY
  map.label

This way, a company with 21 employees joins to all three categories, while a company with only 1 employee joins only to the first.
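
Because each company row is repeated once per matching label before aggregation, the same join should carry over to other aggregates, which is what the bonus question asks about. The sketch below is an illustration under stated assumptions, not part of the original answer: it runs on SQLite through Python's sqlite3 module as a stand-in for MySQL/MariaDB, the aliases n_companies and avg_revenue are made up for the example, and map.min_emp is added to GROUP BY only so the result can be ordered by threshold.

```python
import sqlite3

# In-memory SQLite database loaded with the question's sample data
# (SQLite is an assumption here, standing in for MySQL/MariaDB).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (name VARCHAR(100) NOT NULL, number_of_employees INT NOT NULL, yearly_revenue DECIMAL(30, 2) NOT NULL);
INSERT INTO companies(name, number_of_employees, yearly_revenue) VALUES ('A', 5, 100), ('B', 5, 150), ('C', 15, 200), ('D', 25, 500), ('E', 25, 800);
""")

# Every company is joined to each label whose threshold it exceeds,
# so one company can contribute to several groups.
rows = conn.execute("""
    SELECT map.label,
           COUNT(companies.name)         AS n_companies,
           AVG(companies.yearly_revenue) AS avg_revenue
    FROM (
        SELECT 0 AS min_emp, '> 0' AS label
        UNION ALL SELECT 10, '> 10'
        UNION ALL SELECT 20, '> 20'
    ) AS map
    LEFT JOIN companies
           ON companies.number_of_employees > map.min_emp
    GROUP BY map.min_emp, map.label
    ORDER BY map.min_emp
""").fetchall()

for label, n_companies, avg_revenue in rows:
    print(label, n_companies, avg_revenue)
```

On the sample data this gives 5, 3 and 2 companies for the three labels, with average revenues of 350, 500 and 650.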

You can even have arbitrary ranges. (Note that I use >= AND < here.)

SELECT
  map.label,
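  COUNT(companies.name)  -- counts matched companies only; with LEFT JOIN, COUNT(*) reports 1 for a label with no match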
FROM
(
  SELECT  0 AS min_emp, 15 AS max_emp, '>=  0 AND < 15' AS label
  UNION ALL
  SELECT 10 AS min_emp, 25 AS max_emp, '>= 10 AND < 25' AS label
  UNION ALL
  SELECT 20 AS min_emp, 99 AS max_emp, '>= 20 AND < 99' AS label
)
  AS map
LEFT JOIN
  companies
    ON  companies.number_of_employees >= map.min_emp
    AND companies.number_of_employees <  map.max_emp
GROUP BY
  map.label
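
Two details of the range join are easy to miss: the half-open bounds decide where a boundary value such as 15 or 25 lands, and a LEFT JOIN still emits one row for a range that matches nothing. The following sketch illustrates both on the question's sample data. It assumes SQLite (via Python's sqlite3 module) in place of MySQL/MariaDB, and the fourth range, 100 to 200, is a hypothetical addition with no matching companies, included only to show the difference between COUNT(*) and counting a column from the joined table.

```python
import sqlite3

# In-memory SQLite database loaded with the question's sample data
# (SQLite is an assumption here, standing in for MySQL/MariaDB).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (name VARCHAR(100) NOT NULL, number_of_employees INT NOT NULL, yearly_revenue DECIMAL(30, 2) NOT NULL);
INSERT INTO companies(name, number_of_employees, yearly_revenue) VALUES ('A', 5, 100), ('B', 5, 150), ('C', 15, 200), ('D', 25, 500), ('E', 25, 800);
""")

rows = conn.execute("""
    SELECT map.label,
           COUNT(*)              AS count_star,     -- counts the NULL-padded row too
           COUNT(companies.name) AS count_matched   -- counts only real matches
    FROM (
        SELECT 0 AS min_emp, 15 AS max_emp, '>= 0 AND < 15' AS label
        UNION ALL SELECT 10, 25, '>= 10 AND < 25'
        UNION ALL SELECT 20, 99, '>= 20 AND < 99'
        UNION ALL SELECT 100, 200, '>= 100 AND < 200'  -- hypothetical empty range
    ) AS map
    LEFT JOIN companies
           ON  companies.number_of_employees >= map.min_emp
           AND companies.number_of_employees <  map.max_emp
    GROUP BY map.min_emp, map.label
    ORDER BY map.min_emp
""").fetchall()

for label, count_star, count_matched in rows:
    print(label, count_star, count_matched)
```

Company C (15 employees) falls only in the second range, and D and E (25 employees) fall only in the third, because the upper bound is exclusive. For the empty range, COUNT(*) reports 1 while COUNT(companies.name) reports 0.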

【Comments】:

  • Join to a made-up table to multiply the rows that belong to more than one group, then aggregate with GROUP BY. Nice solution.