【Question title】: SUM For Distinct Rows
【Posted】: 2015-02-18 15:17:27
【Question】:

Given the following table structure:

countries: id, name
regions: id, country_id, name, population
cities: id, region_id, name

...and this query...

SELECT c.name AS country, COUNT(DISTINCT r.id) AS regions, COUNT(s.id) AS cities
FROM countries AS c
JOIN regions AS r ON r.country_id = c.id
JOIN cities AS s ON s.region_id = r.id
GROUP BY c.id

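How can I add a SUM of the regions.population values to calculate each country's population? Each region's value must be counted only once in the sum, but the ungrouped result has multiple rows per region (one row for each city in that region).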

Sample data:

mysql> SELECT * FROM countries;
+----+-----------+
| id | name      |
+----+-----------+
|  1 | country 1 |
|  2 | country 2 |
+----+-----------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM regions;
+----+------------+-----------------------+------------+
| id | country_id | name                  | population |
+----+------------+-----------------------+------------+
| 11 |          1 | region 1 in country 1 |         10 |
| 12 |          1 | region 2 in country 1 |         15 |
| 21 |          2 | region 1 in country 2 |         25 |
+----+------------+-----------------------+------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM cities;
+-----+-----------+---------------------------------+
| id  | region_id | name                            |
+-----+-----------+---------------------------------+
| 111 |        11 | City 1 in region 1 in country 1 |
| 112 |        11 | City 2 in region 1 in country 1 |
| 121 |        12 | City 1 in region 2 in country 1 |
| 211 |        21 | City 1 in region 1 in country 2 |
+-----+-----------+---------------------------------+
4 rows in set (0.00 sec)

Desired output with the sample data:

+-----------+---------+--------+------------+
| country   | regions | cities | population |
+-----------+---------+--------+------------+
| country 1 |       2 |      3 |         25 |
| country 2 |       1 |      1 |         25 |
+-----------+---------+--------+------------+
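
To make the double counting concrete, here is a minimal sketch that loads the sample data above and adds a plain SUM(r.population) to the original query. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, and the column types are guesses, since the question only lists column names.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumed column types).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE regions (id INTEGER PRIMARY KEY, country_id INTEGER, name TEXT, population INTEGER);
CREATE TABLE cities (id INTEGER PRIMARY KEY, region_id INTEGER, name TEXT);
INSERT INTO countries VALUES (1, 'country 1'), (2, 'country 2');
INSERT INTO regions VALUES
    (11, 1, 'region 1 in country 1', 10),
    (12, 1, 'region 2 in country 1', 15),
    (21, 2, 'region 1 in country 2', 25);
INSERT INTO cities VALUES
    (111, 11, 'City 1 in region 1 in country 1'),
    (112, 11, 'City 2 in region 1 in country 1'),
    (121, 12, 'City 1 in region 2 in country 1'),
    (211, 21, 'City 1 in region 1 in country 2');
""")

# The original query with a naive SUM added: region 11 appears once per city,
# so its population of 10 is added twice for country 1.
naive_rows = conn.execute("""
SELECT c.name, COUNT(DISTINCT r.id), COUNT(s.id), SUM(r.population)
FROM countries AS c
JOIN regions AS r ON r.country_id = c.id
JOIN cities AS s ON s.region_id = r.id
GROUP BY c.id
ORDER BY c.id
""").fetchall()

for row in naive_rows:
    print(row)
```

With this data, country 1 comes out as 35 (10 + 10 + 15) instead of the desired 25, because region 11 has two cities.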

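I would prefer a solution that does not require changing the JOIN logic.

The accepted solution of this post seems to be in the neighborhood of what I am looking for, but I could not figure out how to apply it to my problem.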


My solution

SELECT c.id AS country_id,
    c.name AS country,
    COUNT(x.region_id) AS regions,
    SUM(x.population) AS population,
    SUM(x.cities) AS cities
FROM countries AS c
LEFT JOIN (
        SELECT r.country_id,
            r.id AS region_id,
            r.population AS population,
            COUNT(s.id) AS cities
        FROM regions AS r
        LEFT JOIN cities AS s ON s.region_id = r.id
        GROUP BY r.country_id, r.id, r.population
    ) AS x ON x.country_id = c.id
GROUP BY c.id, c.name

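Note: my actual query is much more complex and has nothing to do with countries, regions, or cities. This is a minimal example that illustrates my problem.

【Comments】:

  • Are you sure about that query? You select the city name as "country" and group by the city ID?
  • Do you need the population per region or per country?
  • My mistake. I have corrected the aliases. I want the population of the country.

Tags: mysql sql select group-by sum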


【Solution 1】:

Use LEFT OUTER JOIN instead of INNER JOIN: with an INNER JOIN, a country that has no regions does not appear in the result at all, and in the same way a region that has no cities is not counted.

Try this:

SELECT c.name AS country, r.regions, r.population, r.cities 
FROM countries AS c 
LEFT OUTER JOIN (SELECT r.country_id, 
                        COUNT(r.id) AS regions, 
                        SUM(r.population) AS population, 
                        SUM(c.cities) AS cities
                 FROM regions AS r 
                 LEFT OUTER JOIN (SELECT c.region_id, COUNT(c.id) AS cities 
                                  FROM cities AS c
                                  GROUP BY c.region_id
                                 ) AS c ON r.id = c.region_id 
                 GROUP BY r.country_id
                ) AS r ON c.id = r.country_id;

See the SQL Fiddle demo.

Output

| COUNTRY | REGIONS | POPULATION | CITIES |
|---------|---------|------------|--------|
|     usa |       3 |         16 |      4 |
| germany |       2 |          5 |      1 |
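
The LEFT versus INNER JOIN point made at the top of this answer can be checked with a small sketch. It runs the query shape from this answer in SQLite through Python's sqlite3 module (a stand-in for MySQL) on the question's sample data plus a hypothetical "country 3" that has no regions. The helper name run and the aliases agg and ct are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE regions (id INTEGER PRIMARY KEY, country_id INTEGER, name TEXT, population INTEGER);
CREATE TABLE cities (id INTEGER PRIMARY KEY, region_id INTEGER, name TEXT);
-- 'country 3' is a hypothetical extra row with no regions.
INSERT INTO countries VALUES (1, 'country 1'), (2, 'country 2'), (3, 'country 3');
INSERT INTO regions VALUES (11, 1, 'r11', 10), (12, 1, 'r12', 15), (21, 2, 'r21', 25);
INSERT INTO cities VALUES (111, 11, 'c111'), (112, 11, 'c112'), (121, 12, 'c121'), (211, 21, 'c211');
""")

QUERY = """
SELECT c.name, agg.regions, agg.population, agg.cities
FROM countries AS c
{join} (SELECT r.country_id,
               COUNT(r.id) AS regions,
               SUM(r.population) AS population,
               SUM(ct.cities) AS cities
        FROM regions AS r
        LEFT OUTER JOIN (SELECT region_id, COUNT(id) AS cities
                         FROM cities
                         GROUP BY region_id) AS ct ON r.id = ct.region_id
        GROUP BY r.country_id) AS agg ON c.id = agg.country_id
ORDER BY c.id
"""

def run(join_kind):
    """Run the query with the given join keyword for the outer join."""
    return conn.execute(QUERY.format(join=join_kind)).fetchall()

left_rows = run("LEFT OUTER JOIN")   # keeps country 3, with NULL aggregates
inner_rows = run("INNER JOIN")       # drops country 3 entirely

print(left_rows)
print(inner_rows)
```

With LEFT OUTER JOIN the country without regions is still listed (its aggregates are NULL, shown as None in Python). With INNER JOIN it disappears from the result.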

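【Comments】:

  • Thanks for discussing the difference between LEFT and INNER joins. I already knew it, but the information is good for future readers.
【Solution 2】:

Your problem is a common one. You join all the tables that have something to do with the data you want to see, and only then start thinking about how to get at that data. When different aggregations are involved, as in your case, that is not easy to do.

So it is better to join what you are really interested in. In your case: the countries and the (aggregated) region/city data per country. This keeps the query straightforward and easy to maintain.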

select 
  c.name as country, 
  r.regions, 
  r.population,
  r.cities
from countries as c
join 
(
  select 
    country_id,
    count(*) as regions,
    sum(population) as population,
    sum((select count(*) from cities where cities.region_id = regions.id)) as cities
  from regions
  group by country_id
) as r on r.country_id = c.id;

【Comments】:

    【Solution 3】:

    If you do not want to introduce or change a JOIN or SUBQUERY, here is another approach:

    SELECT 
      c.name AS country, 
      COUNT(distinct r.id) AS regions, 
      COUNT(s.id) AS cities,
      SUM(DISTINCT(((((r.id*r.id) + (r.population*r.id)))-(r.id*r.id))/r.id)) as total_population
    FROM 
      countries AS c
      JOIN regions AS r ON r.country_id = c.id
      LEFT JOIN cities AS s ON s.region_id = r.id
    GROUP 
      BY c.id
    

    http://sqlfiddle.com/#!2/3dd8ba/22/0
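
One caveat worth checking before relying on this: for any non-zero r.id, the arithmetic inside SUM(DISTINCT ...) simplifies to r.population, so the query behaves like SUM(DISTINCT r.population) and two regions of the same country with equal populations are counted only once. The sketch below shows this in SQLite through Python's sqlite3 module (a stand-in for MySQL, where / yields a decimal but the value is the same). The data set, one country whose regions have populations 10, 15 and 15, is an assumption chosen to trigger the problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE regions (id INTEGER PRIMARY KEY, country_id INTEGER, name TEXT, population INTEGER);
CREATE TABLE cities (id INTEGER PRIMARY KEY, region_id INTEGER, name TEXT);
INSERT INTO countries VALUES (1, 'country 1');
-- Assumed data: two regions share the same population value (15).
INSERT INTO regions VALUES (1, 1, 'c1 - r1', 10), (2, 1, 'c1 - r2', 15), (3, 1, 'c1 - r3', 15);
INSERT INTO cities VALUES (1, 1, 'city 1'), (2, 1, 'city 2'), (3, 2, 'city 3'), (4, 3, 'city 4');
""")

# The expression from this answer, unchanged.
trick_total = conn.execute("""
SELECT SUM(DISTINCT(((((r.id*r.id) + (r.population*r.id)))-(r.id*r.id))/r.id))
FROM countries AS c
JOIN regions AS r ON r.country_id = c.id
LEFT JOIN cities AS s ON s.region_id = r.id
GROUP BY c.id
""").fetchone()[0]

# Reference value computed directly on the regions table, without any fan-out.
true_total = conn.execute(
    "SELECT SUM(population) FROM regions WHERE country_id = 1"
).fetchone()[0]

print(trick_total, true_total)
```

Here the trick yields 25 while the real total is 40, so the approach is only safe when no two regions of a country can have the same population.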

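    【Comments】:

      【Solution 4】:

      First, you should know that the question you mentioned and its solution are a bit different from your question and its solution. That is why you cannot use only JOINs without a subquery.

      Tables:

      countries: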

      ===========================
      |     id     |    name    |
      ===========================
      |     1      | country 1  |
      ---------------------------
      |     2      | country 2  |
      ---------------------------
      |     3      | country 3  |
      ---------------------------
      |     4      | country 4  |
      ---------------------------
      

      regions:

      =============================================
      |    id    |country_id|   name   |population|
      =============================================
      |    1     |    1     | c1 - r1  |    10    |
      ---------------------------------------------
      |    2     |    1     | c1 - r2  |    15    |
      ---------------------------------------------
      |    3     |    1     | c1 - r3  |    15    |
      ---------------------------------------------
      |    4     |    2     | c2 - r1  |    25    |
      ---------------------------------------------
      |    5     |    3     | c3 - r1  |    13    |
      ---------------------------------------------
      

      cities:

      ========================================
      |     id     | region_id  |    name    |
      ========================================
      |     1      |     1      |   city 1   |
      ----------------------------------------
      |     2      |     1      |   city 2   |
      ----------------------------------------
      |     3      |     2      |   city 3   |
      ----------------------------------------
      |     4      |     2      |   city 4   |
      ----------------------------------------
      |     5      |     2      |   city 5   |
      ----------------------------------------
      |     6      |     3      |   city 6   |
      ----------------------------------------
      |     7      |     3      |   city 7   |
      ----------------------------------------
      |     8      |     4      |   city 8   |
      ----------------------------------------
      |     9      |     4      |   city 9   |
      ----------------------------------------
      |     10     |     4      |  city 10   |
      ----------------------------------------
      

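      As a simple approach, you can join the countries table with a subquery that joins the regions and cities tables, so that you effectively work with two tables: countries, and regions with an extra cities column: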

      SQL:

      SELECT
          r.id AS id,
          r.country_id AS country_id,
          r.name AS name,
          r.population AS population,
          COUNT(s.region_id) AS cities
      FROM regions r
          /* LEFT JOIN rather than a plain JOIN, so that regions without cities are returned too */
          LEFT JOIN cities s
              ON r.id = s.region_id
      GROUP BY r.id
      

      Data:

      ==================================================================
      |     id     | country_id |    name    | population |   cities   |
      ==================================================================
      |     1      |     1      |  c1 - r1   |     10     |     2      |
      ------------------------------------------------------------------
      |     2      |     1      |  c1 - r2   |     15     |     3      |
      ------------------------------------------------------------------
      |     3      |     1      |  c1 - r3   |     15     |     2      |
      ------------------------------------------------------------------
      |     4      |     2      |  c2 - r1   |     25     |     3      |
      ------------------------------------------------------------------
      |     5      |     3      |  c3 - r1   |     13     |     0      |
      ------------------------------------------------------------------  
      

      Then you write your normal query on top of it, which gives you this code:

      SQL:

      SELECT
          c.name AS country,
          COUNT(r.country_id) AS regions,
          /* ifnull is used here to show 0 instead of null */
          SUM(IFNULL(r.cities, 0)) AS cities,
          SUM(IFNULL(r.population, 0)) AS population
      FROM countries c
          /* LEFT JOIN rather than a plain JOIN, so that countries without regions are returned too */
          LEFT JOIN (
              SELECT
                  /* we don't need regions.id and regions.name */
                  r.country_id AS country_id,
                  r.population AS population,
                  COUNT(s.region_id) AS cities
              FROM regions r
                  LEFT JOIN cities s
                      ON r.id = s.region_id
              GROUP BY r.id
          ) r
          ON c.id = r.country_id
      GROUP BY c.id
      

      This gives the result:

      =====================================================
      |  country   |  regions   |   cities   | population |
      =====================================================
      | country 1  |     3      |     7      |     40     |
      -----------------------------------------------------
      | country 2  |     1      |     3      |     25     |
      -----------------------------------------------------
      | country 3  |     1      |     0      |     13     |
      -----------------------------------------------------
      | country 4  |     0      |     0      |     0      |
      -----------------------------------------------------   
      

      For comparison, using only JOIN drops the countries without regions and the regions without cities:

      =====================================================
      |  country   |  regions   |   cities   | population |
      =====================================================
      | country 1  |     3      |     7      |     40     |
      -----------------------------------------------------
      | country 2  |     1      |     3      |     25     |
      -----------------------------------------------------
      

      For your exact example (the data mentioned in your question), you get:

      =====================================================
      |  country   |  regions   |   cities   | population |
      =====================================================
      | country 1  |     2      |     3      |     25     |
      -----------------------------------------------------
      | country 2  |     1      |     1      |     25     |
      -----------------------------------------------------
      

      I hope all of this helps you get what you want.

      【Comments】:

        【Solution 5】:

        I tested the query below in SQL against the same tables you provided:

        select regioncount.name as country,regioncount.regions, citycount.cities,regioncount.population    from
         (SELECT c.name,c.id,COUNT(r.id) AS regions ,SUM(r.population) as population
        FROM countries AS c
        JOIN regions AS r  on c.id = r.country_id GROUP BY c.id,c.name) as regioncount
        
        join
        
        (SELECT 
        r.country_id,
            COUNT(s.id) AS cities 
        FROM regions AS r
        JOIN cities AS s  on r.id =s.region_id GROUP BY r.country_id) as citycount on citycount.country_id = regioncount.id
        

        and I got the result you want:

        +-----------+---------+--------+------------+
        | country   | regions | cities | population |
        +-----------+---------+--------+------------+
        | country 1 |       2 |      3 |         25 |
        | country 2 |       1 |      1 |         25 |
        +-----------+---------+--------+------------+
        

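        【Comments】:

          【Solution 6】:

          First, the other post you referenced is not the same situation. In that case the joins are like [A -> B and A -> C], so the weighted average (which is what that calculation does) is correct. In your case the joins are like [A -> B -> C], so you need a different approach.

          The simplest solution that comes to mind right away does involve a subquery, but not a complicated one: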

          SELECT 
              c.name AS country, 
              COUNT(r.id) AS regions, 
              SUM(s.city_count) AS cities,
              SUM(r.population) as population
          FROM countries AS c
          JOIN regions AS r ON r.country_id = c.id
          JOIN 
              (select region_id, count(*) as city_count
              from cities 
              group by region_id) AS s
          ON s.region_id = r.id
          GROUP BY c.id
          

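          The reason this works is that it resolves the cities to one row per region before joining them to the regions, which eliminates the cross-join situation.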
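
The "one row per region" step can be inspected on its own. The sketch below runs the query from this answer against the question's sample data in SQLite through Python's sqlite3 module (a stand-in for MySQL, with assumed column types and shortened names), and also prints the intermediate derived table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE regions (id INTEGER PRIMARY KEY, country_id INTEGER, name TEXT, population INTEGER);
CREATE TABLE cities (id INTEGER PRIMARY KEY, region_id INTEGER, name TEXT);
INSERT INTO countries VALUES (1, 'country 1'), (2, 'country 2');
INSERT INTO regions VALUES (11, 1, 'r11', 10), (12, 1, 'r12', 15), (21, 2, 'r21', 25);
INSERT INTO cities VALUES (111, 11, 'c111'), (112, 11, 'c112'), (121, 12, 'c121'), (211, 21, 'c211');
""")

# Step 1: the derived table alone, exactly one row per region.
city_counts = conn.execute("""
SELECT region_id, COUNT(*) AS city_count
FROM cities
GROUP BY region_id
ORDER BY region_id
""").fetchall()

# Step 2: the full query. Each region now joins to a single row,
# so SUM(r.population) sees every region once.
rows = conn.execute("""
SELECT c.name AS country,
       COUNT(r.id) AS regions,
       SUM(s.city_count) AS cities,
       SUM(r.population) AS population
FROM countries AS c
JOIN regions AS r ON r.country_id = c.id
JOIN (SELECT region_id, COUNT(*) AS city_count
      FROM cities
      GROUP BY region_id) AS s ON s.region_id = r.id
GROUP BY c.id
ORDER BY c.id
""").fetchall()

print(city_counts)
print(rows)
```

Because both joins here are inner joins, a region without cities (and a country without regions) would still drop out. Switching them to LEFT JOIN keeps those rows.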

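          【Comments】:

          • Thanks for explaining why the other solution does not apply to my case.
          【Solution 7】:

          How about leaving the rest as it is and adding one more join for the population?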

          SELECT c.name AS country, 
                 COUNT(distinct r.id) AS regions, 
                 COUNT(s.id) AS cities, 
                 pop_regs.sum as total_population
          FROM countries AS c
          LEFT JOIN regions AS r ON r.country_id = c.id
          LEFT JOIN cities AS s ON s.region_id = r.id
          left join 
          (
              select country_id, sum(population) as sum 
              from regions 
              group by country_id
          ) pop_regs on pop_regs.country_id = c.id
          GROUP BY c.id, c.name
          

          SQLFiddle demo

          【Comments】:
