【Question title】: Efficient SQL query to add totals and existence aggregates from a many-many table
【Posted】: 2019-07-15 23:56:21
【Question】:

I have two tables:

  • Magazines, with fields id, name;
  • Subscriptions, with fields subscriber_id, magazine_id, which are foreign keys to a separate subscribers table and to Magazines respectively.

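I want to extend the Magazines table with:

  • a third column holding the total number of its subscribers
  • a fourth column, subscribed, holding a boolean that is true if a record exists in Subscriptions with my subscriber_id (say I am subscriber_id=1).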

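Is there a canonical way to solve this? If not, how can I do it efficiently?

To get the subscribers column, the simplest query I could think of omits magazines that have no subscribers. (I did read here, but could not see how it relates.)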

SELECT magazine_id, COUNT(*)
FROM subscriptions GROUP BY magazine_id
ORDER BY magazine_id ASC;

So I did something a bit hacky and probably inefficient:

SELECT
 magazines.id,
 CASE WHEN NOT EXISTS(
  SELECT * FROM subscriptions WHERE magazines.id=magazine_id
 ) THEN 0
 ELSE COUNT(*) END AS subscribers
FROM
 magazines
LEFT JOIN
  subscriptions
ON
 magazines.id = subscriptions.magazine_id
GROUP BY
 magazines.id
ORDER BY
 magazines.id ASC;

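For the boolean fourth column, I got a solution by left-joining the previous result with subscriptions once more and applying CASE WHEN ... IS NULL to the subscriber_id value.

I am using PostgreSQL. I am not a student, but I want to learn how to solve this; magazines/subscribers is just an example. I only know the basics of SQL.

【Comments】:

    Tags: sql aggregate-functions rdbms query-performance sqlperformance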


    【Answer 1】:

    The first step is to take the query that counts subscribers and join it to the magazines table.

    So

    SELECT magazine_id, COUNT(*) AS subscriber_count
    FROM subscriptions GROUP BY magazine_id
    

    becomes

    SELECT mag.id, COALESCE(subs.subscriber_count, 0) AS subscribers
    FROM magazines mag
    LEFT JOIN (
        SELECT magazine_id, COUNT(*) AS subscriber_count 
        FROM subscriptions GROUP BY magazine_id) subs ON subs.magazine_id = mag.id
    ORDER BY mag.id ASC
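
A minimal sketch of the joined count query on sample data. It runs through Python's built-in sqlite3 module purely so that it is self-contained; the rows and magazine names are made up for illustration, and the SELECT itself uses only standard SQL, so it is expected to behave the same way on PostgreSQL.

```python
import sqlite3

# Hypothetical sample data: magazine 3 has no subscribers at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE magazines (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE subscriptions (subscriber_id INTEGER, magazine_id INTEGER);
    INSERT INTO magazines VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO subscriptions VALUES (1, 1), (2, 1), (2, 2);
""")

# Pre-aggregate subscriptions, then LEFT JOIN so that every magazine is kept;
# COALESCE turns the NULL of an unmatched magazine into 0.
counts = conn.execute("""
    SELECT mag.id, COALESCE(subs.subscriber_count, 0) AS subscribers
    FROM magazines mag
    LEFT JOIN (
        SELECT magazine_id, COUNT(*) AS subscriber_count
        FROM subscriptions GROUP BY magazine_id) subs ON subs.magazine_id = mag.id
    ORDER BY mag.id ASC
""").fetchall()

print(counts)  # [(1, 2), (2, 1), (3, 0)]
```

Magazine 3 appears with a count of 0 instead of being dropped, which is the behaviour the plain GROUP BY query lacks.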
    

    The column showing whether you are a subscriber can be added in several ways:

    1) As above, write a separate query and join it to magazines:

    --Query:
    SELECT DISTINCT magazine_id FROM subscriptions WHERE subscriber_id = 1
    
    --Joined:
    SELECT 
        mag.id, 
        COALESCE(subs.subscriber_count, 0) AS subscribers,
        CASE WHEN mysubs.magazine_id IS NULL THEN 0 ELSE 1 END AS my_subscription
    FROM magazines mag
    LEFT JOIN (
        SELECT magazine_id, COUNT(*) AS subscriber_count 
        FROM subscriptions GROUP BY magazine_id) subs ON subs.magazine_id = mag.id
    LEFT JOIN (
        SELECT DISTINCT magazine_id FROM subscriptions WHERE subscriber_id = 1
        ) mysubs ON mysubs.magazine_id = mag.id
    ORDER BY mag.id ASC
    

    2) If the subscriptions table is unique on (magazine_id, subscriber_id), you can join it directly without a subquery:

    SELECT 
        mag.id, 
        COALESCE(subs.subscriber_count, 0) AS subscribers,
        CASE WHEN mysubs.magazine_id IS NULL THEN 0 ELSE 1 END AS my_subscription
    FROM magazines mag
    LEFT JOIN (
        SELECT magazine_id, COUNT(*) AS subscriber_count 
        FROM subscriptions GROUP BY magazine_id) subs ON subs.magazine_id = mag.id
    LEFT JOIN subscriptions mysubs ON mysubs.magazine_id = mag.id AND mysubs.subscriber_id = 1
    ORDER BY mag.id ASC
    

    3) Add the condition to the subscriptions subquery, using a MAX aggregate over a 0/1 flag:

    SELECT 
        mag.id, 
        COALESCE(subs.subscriber_count, 0) AS subscribers,
        COALESCE(subs.mysub, 0) AS my_subscription
    FROM magazines mag
    LEFT JOIN (
        SELECT magazine_id, COUNT(*) AS subscriber_count,
            MAX(CASE WHEN subscriber_id = 1 THEN 1 ELSE 0 END) AS mysub
        FROM subscriptions GROUP BY magazine_id) subs ON subs.magazine_id = mag.id
    ORDER BY mag.id ASC
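
As a rough check that the three options agree, the sketch below runs all of them against the same sample data. It again uses Python's sqlite3 module only as a self-contained harness; the data is hypothetical, and the UNIQUE constraint is an assumption added because option 2 depends on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE magazines (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE subscriptions (
        subscriber_id INTEGER,
        magazine_id INTEGER,
        UNIQUE (magazine_id, subscriber_id)  -- assumed; option 2 needs it
    );
    INSERT INTO magazines VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO subscriptions VALUES (1, 1), (2, 1), (2, 2);
""")

queries = {
    "1) joined subquery": """
        SELECT mag.id,
               COALESCE(subs.subscriber_count, 0),
               CASE WHEN mysubs.magazine_id IS NULL THEN 0 ELSE 1 END
        FROM magazines mag
        LEFT JOIN (SELECT magazine_id, COUNT(*) AS subscriber_count
                   FROM subscriptions GROUP BY magazine_id) subs
               ON subs.magazine_id = mag.id
        LEFT JOIN (SELECT DISTINCT magazine_id
                   FROM subscriptions WHERE subscriber_id = 1) mysubs
               ON mysubs.magazine_id = mag.id
        ORDER BY mag.id""",
    "2) direct join": """
        SELECT mag.id,
               COALESCE(subs.subscriber_count, 0),
               CASE WHEN mysubs.magazine_id IS NULL THEN 0 ELSE 1 END
        FROM magazines mag
        LEFT JOIN (SELECT magazine_id, COUNT(*) AS subscriber_count
                   FROM subscriptions GROUP BY magazine_id) subs
               ON subs.magazine_id = mag.id
        LEFT JOIN subscriptions mysubs
               ON mysubs.magazine_id = mag.id AND mysubs.subscriber_id = 1
        ORDER BY mag.id""",
    "3) MAX over a flag": """
        SELECT mag.id,
               COALESCE(subs.subscriber_count, 0),
               COALESCE(subs.mysub, 0)
        FROM magazines mag
        LEFT JOIN (SELECT magazine_id, COUNT(*) AS subscriber_count,
                          MAX(CASE WHEN subscriber_id = 1 THEN 1 ELSE 0 END) AS mysub
                   FROM subscriptions GROUP BY magazine_id) subs
               ON subs.magazine_id = mag.id
        ORDER BY mag.id""",
}

# Run every variant and collect its rows under the option's label.
results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```

On this data each variant returns (id, subscribers, my_subscription) rows of (1, 2, 1), (2, 1, 0) and (3, 0, 0). Which one performs best on a real PostgreSQL table depends on indexes and data volume, so comparing them with EXPLAIN ANALYZE is the safer guide.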
    

    【Comments】:

      【Answer 2】:

      If you only want the magazine ID, you can do this:

      select magazine_id, count(*) as num_subscribers,
             max(case when subscriber_id = 1 then 1 else 0 end) as my_subscriber
      from subscriptions s
      group by magazine_id;
      

      That is, if you only need the magazine ID, you do not need a join.

      Note: this returns only magazines that have subscribers.

      It is easily extended with a left join:

      select m.id, count(s.magazine_id) as num_subscribers,
             max(case when s.subscriber_id = 1 then 1 else 0 end) as my_subscriber
      from magazines m left join
           subscriptions s
           on s.magazine_id = m.id
      group by m.id;
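
The left-join version counts s.magazine_id rather than using count(*), and that choice matters: a magazine with no subscribers still yields one joined row, filled with NULLs. The sketch below shows the difference on hypothetical sample data, again using Python's sqlite3 module only to keep it self-contained.

```python
import sqlite3

# Hypothetical sample data: magazine 3 has no subscribers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE magazines (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE subscriptions (subscriber_id INTEGER, magazine_id INTEGER);
    INSERT INTO magazines VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO subscriptions VALUES (1, 1), (2, 1), (2, 2);
""")

# COUNT(*) counts joined rows, including the all-NULL row of an unmatched
# magazine; COUNT(column) skips NULLs and therefore reports 0 for it.
rows = conn.execute("""
    SELECT m.id,
           COUNT(*)             AS count_star,
           COUNT(s.magazine_id) AS count_column
    FROM magazines m
    LEFT JOIN subscriptions s ON s.magazine_id = m.id
    GROUP BY m.id
    ORDER BY m.id
""").fetchall()

print(rows)  # [(1, 2, 2), (2, 1, 1), (3, 1, 0)]
```

For magazine 3, COUNT(*) reports 1 while COUNT(s.magazine_id) reports 0. This is why the query in the question needed a CASE WHEN NOT EXISTS guard around COUNT(*), and why counting a column from the joined table makes that guard unnecessary.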
      

      【Comments】:
