【Question title】: SQL Count over multiple fields
【Posted】: 2011-07-23 12:16:04
【Question】:

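This is a follow-up to a previous question: Complicated COUNT query in MySQL. None of the answers worked under all conditions, and I have had trouble working out a solution myself. I will award a 75-point bounty to the first person who provides a fully correct answer (I will award it as soon as the bounty becomes available; for reference, I have done this before: Improving Python/django view code).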

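I want to get a count of the video credits a user has, without allowing duplicates (i.e., for each video, a user can be credited in it 0 or 1 times). I want to find three counts: the number of videos the user has uploaded (easy) -- Uploads; the number of videos the user is credited in among videos the user did not upload -- Credited_by_others; and the total number of videos the user is credited in -- Total_credits.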

I have three tables:

CREATE TABLE `userprofile_userprofile` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `full_name` varchar(100) NOT NULL,
   ...
 )

CREATE TABLE `videos_video` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `title` int(11) NOT NULL,
  `uploaded_by_id` int(11) NOT NULL,
  ...
  KEY `userprofile_video_e43a31e7` (`uploaded_by_id`),
  CONSTRAINT `uploaded_by_id_refs_id_492ba9396be0968c` FOREIGN KEY (`uploaded_by_id`) REFERENCES `userprofile_userprofile` (`id`)
)

Note that uploaded_by_id is the same as userprofile.id.

CREATE TABLE `videos_videocredit` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `video_id` int(11) NOT NULL,
  `profile_id` int(11) DEFAULT NULL,
  `position` int(11) NOT NULL
  ...
  KEY `videos_videocredit_fa26288c` (`video_id`),
  KEY `videos_videocredit_141c6eec` (`profile_id`),
  CONSTRAINT `profile_id_refs_id_31fc4a6405dffd9f` FOREIGN KEY (`profile_id`) REFERENCES `userprofile_userprofile` (`id`),
  CONSTRAINT `video_id_refs_id_4dcff2eeed362a80` FOREIGN KEY (`video_id`) REFERENCES `videos_video` (`id`)
)

Here is a step-by-step illustration:

1) Create 2 users:

insert into userprofile_userprofile (id, full_name) values (1, 'John Smith');
insert into userprofile_userprofile (id, full_name) values (2, 'Jane Doe');

2) A user uploads a video. He has not yet credited anyone, including himself, in it.

insert into videos_video (id, title, uploaded_by_id) values (1, 'Hamlet', 1);

The result should be as follows:

**User**     **Uploads**  **Credited_by_others**  **Total_credits**
John Smith       1                0                      1
Jane Doe         0                0                      0

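3) The user who uploaded the video now credits himself in it. Note that this should not change anything, because the user already received a credit for uploading the video and I do not allow duplicate credits:

insert into videos_videocredit (id, video_id, profile_id, position) values (1, 1, 1, 'director');

The result should now be as follows: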

**User**     **Uploads**  **Credited_by_others**  **Total_credits**
John Smith       1                0                      1
Jane Doe         0                0                      0

4) The user now credits himself two more times in the same video (i.e., he has multiple "positions" in the video). In addition, he credits Jane Doe three times in that video:

insert into videos_videocredit (id, video_id, profile_id, position) values (2, 1, 1, 'writer');
insert into videos_videocredit (id, video_id, profile_id, position) values (3, 1, 1, 'producer');
insert into videos_videocredit (id, video_id, profile_id, position) values (4, 1, 2, 'director');
insert into videos_videocredit (id, video_id, profile_id, position) values (5, 1, 2, 'editor');
insert into videos_videocredit (id, video_id, profile_id, position) values (6, 1, 2, 'decorator');

The result should now be as follows:

**User**     **Uploads**  **Credited_by_others**  **Total_credits**
John Smith       1                0                      1
Jane Doe         0                1                      1

5) Jane Doe now uploads a video. She does not credit herself, but she credits John Smith twice in the video:

insert into videos_video (id, title, uploaded_by_id) values (2, 'Othello', 2);
insert into videos_videocredit (id, video_id, profile_id, position) values (7, 2, 1, 'writer');
insert into videos_videocredit (id, video_id, profile_id, position) values (8, 2, 1, 'producer');

The result should now be as follows:

**User**     **Uploads**  **Credited_by_others**  **Total_credits**
John Smith       1                1                      2
Jane Doe         1                1                      2

So, for each user I want to find these three fields: Uploads, Credited_by_others, and Total_credits. The data should never be NULL; a field with no count should be 0. Thank you.

【Comments】:

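  • Under (5), shouldn't it be (7, 2, 1) and (8, 2, 1)?
  • Just to clarify once more: Jane gets a credit simply for uploading, right? She does not need to credit herself explicitly in her own upload?
  • Correct. Uploading a video automatically credits the user who uploaded it. So if someone uploads a video, they get exactly 1 credit for it whether they are credited in that video 0 times or 1000 times. (As a side note, only individuals involved in making a video can upload it; a video will not be uploaded by a random person who had nothing to do with it.)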

Tags: mysql sql count aggregate


【Solution 1】:

I rewrote the query using joins, so that it is easier for the server to optimize.

First, two views that simplify the query:

CREATE VIEW IF NOT EXISTS vperson_videos AS
    SELECT
        v.uploaded_by_id AS id,
        COUNT(*) AS uploads
    FROM vvideo v
    GROUP BY v.uploaded_by_id;

The view above simply counts the number of videos each user has uploaded.

CREATE VIEW vperson_credits AS
    SELECT
        c.profile_id AS id,
        COUNT(DISTINCT c.video_id) AS credits
    FROM vcredit c
    INNER JOIN vvideo cv ON cv.id = c.video_id
    WHERE cv.uploaded_by_id <> c.profile_id
    GROUP BY c.profile_id;

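The view above counts the number of (distinct) videos each user is credited in, ignoring videos the user uploaded themselves.

Then the query itself: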

SELECT
    p.id,
    p.full_name,
    IFNULL(pv.uploads,0) AS uploads,
    IFNULL(pc.credits,0) AS credits,
    IFNULL(pv.uploads,0) + IFNULL(pc.credits,0) AS total_credits
FROM vperson p
LEFT OUTER JOIN vperson_videos pv ON pv.id = p.id
LEFT OUTER JOIN vperson_credits pc ON pc.id = p.id;

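I use LEFT OUTER JOIN to include users who have not uploaded any videos or are not credited in any video. IFNULL() is necessary because otherwise those users would get NULL instead of 0.

The final result is: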
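
To illustrate the point about LEFT OUTER JOIN and IFNULL(), here is a minimal sketch that uses Python's sqlite3 module as a stand-in for MySQL. The two-column vperson and vvideo tables are simplified assumptions (the answer does not show their definitions), only John has an upload in this toy data, and upload_counts is a hypothetical helper written for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vperson (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL);
CREATE TABLE vvideo  (id INTEGER PRIMARY KEY, uploaded_by_id INTEGER NOT NULL);
INSERT INTO vperson VALUES (1, 'John Smith'), (2, 'Jane Doe');
INSERT INTO vvideo  VALUES (1, 1);  -- assumption: only John has uploaded a video
""")

def upload_counts(conn, join_kind, wrap_ifnull):
    """Join each person to their upload count (hypothetical helper).

    join_kind   -- "INNER" or "LEFT OUTER"
    wrap_ifnull -- when True, wrap the count in IFNULL(..., 0)
    """
    column = "IFNULL(pv.uploads, 0)" if wrap_ifnull else "pv.uploads"
    sql = f"""
        SELECT p.full_name, {column}
        FROM vperson p
        {join_kind} JOIN (
            SELECT uploaded_by_id AS id, COUNT(*) AS uploads
            FROM vvideo
            GROUP BY uploaded_by_id
        ) pv ON pv.id = p.id
        ORDER BY p.id
    """
    return conn.execute(sql).fetchall()

# An inner join drops Jane entirely, a left join keeps her with NULL (None),
# and IFNULL turns that NULL into 0.
print(upload_counts(conn, "INNER", False))
print(upload_counts(conn, "LEFT OUTER", False))
print(upload_counts(conn, "LEFT OUTER", True))
```

The same pattern applies to the credits view: any user missing from a grouped view only shows up in the final result through the outer join.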

+----+------------+---------+---------+---------------+
| id | full_name  | uploads | credits | total_credits |
+----+------------+---------+---------+---------------+
|  1 | John Smith |       1 |       1 |             2 | 
|  2 | Jane Doe   |       1 |       1 |             2 | 
+----+------------+---------+---------+---------------+

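【Comments】:

    【Solution 2】:

    First, I think your problem description contains a couple of errors.

    • In step 5, you describe Jane crediting John twice in video 2. I think you simply had some columns in the wrong order in the values clause. It should be: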

      insert into videos_videocredit (id, video_id, profile_id, position) values (7, 2, 1, 'writer');
      insert into videos_videocredit (id, video_id, profile_id, position) values (8, 2, 1, 'producer');
      
    • Your results should show John credited in 2 videos and Jane credited in 1 video.

      +------------+---------+--------------------+---------------+
      | full_name  | Uploads | Credited_by_others | Total_credits |
      +------------+---------+--------------------+---------------+
      | John Smith |       1 |                  1 |             2 | 
      | Jane Doe   |       1 |                  1 |             1 | 
      +------------+---------+--------------------+---------------+
      

    I tested the following query on MySQL 5.1.57, and it gives the result above.

    SELECT
      u.full_name,
      COUNT(DISTINCT myvideos.id) AS Uploads,
      COUNT(DISTINCT byothers.id) AS Credited_by_others,
      COUNT(DISTINCT credited.id) AS Total_credits
    FROM userprofile_userprofile AS u
    LEFT OUTER JOIN videos_video AS myvideos ON myvideos.uploaded_by_id = u.id
    LEFT OUTER JOIN (
      videos_videocredit AS c USE INDEX (videocredit_profileid_videoid)
      INNER JOIN videos_video AS credited
        ON c.video_id = credited.id
    ) ON c.profile_id = u.id
    LEFT OUTER JOIN videos_video AS byothers USE INDEX (video_up_id)
      ON c.video_id = byothers.id
      AND byothers.uploaded_by_id <> u.id
    GROUP BY u.id
    

    I created a couple of additional indexes and gave the query hints to use them.

    CREATE INDEX video_up_id ON videos_video (id,uploaded_by_id);
    
    CREATE INDEX videocredit_profileid_videoid ON videos_videocredit (profile_id,video_id);
    

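    This ensures that all tables (except userprofile) are accessed with "Using index", meaning the query can be satisfied by reading only the index B-tree, without reading the table data. Here is the EXPLAIN report: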

    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: u
             type: index
    possible_keys: NULL
              key: PRIMARY
          key_len: 4
              ref: NULL
             rows: 2
            Extra: 
    *************************** 2. row ***************************
               id: 1
      select_type: SIMPLE
            table: myvideos
             type: ref
    possible_keys: userprofile_video_e43a31e7
              key: userprofile_video_e43a31e7
          key_len: 4
              ref: test.u.id
             rows: 1
            Extra: Using index
    *************************** 3. row ***************************
               id: 1
      select_type: SIMPLE
            table: c
             type: ref
    possible_keys: videocredit_profileid_videoid
              key: videocredit_profileid_videoid
          key_len: 5
              ref: test.u.id
             rows: 1
            Extra: Using index
    *************************** 4. row ***************************
               id: 1
      select_type: SIMPLE
            table: credited
             type: eq_ref
    possible_keys: PRIMARY,video_up_id
              key: PRIMARY
          key_len: 4
              ref: test.c.video_id
             rows: 1
            Extra: Using index
    *************************** 5. row ***************************
               id: 1
      select_type: SIMPLE
            table: byothers
             type: ref
    possible_keys: video_up_id
              key: video_up_id
          key_len: 4
              ref: test.c.video_id
             rows: 1
            Extra: Using index
    5 rows in set (0.00 sec)
    

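    When testing against a small number of rows, the optimizer can produce varying reports. So when testing against a real dataset we may see different results, and it may then be unnecessary to give the USE INDEX hints.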
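
The idea of a query being answered from the index alone can be sketched outside MySQL too. The example below is only an illustration under stated assumptions: it uses SQLite through Python's sqlite3 module, whose planner and EXPLAIN QUERY PLAN output differ from MySQL's EXPLAIN, the TEXT type for position is an assumption, and plan_details is a hypothetical helper. It reuses the index definition from this answer and compares a query whose columns are all in the index with one that also needs a column outside it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE videos_videocredit (
    id         INTEGER PRIMARY KEY,
    video_id   INTEGER NOT NULL,
    profile_id INTEGER,
    position   TEXT NOT NULL      -- TEXT is an assumption for this sketch
);
CREATE INDEX videocredit_profileid_videoid
    ON videos_videocredit (profile_id, video_id);
""")

def plan_details(conn, sql):
    """Return the human-readable last column of SQLite's query plan rows."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Both profile_id and video_id live in the index, so the table is not needed.
covered = plan_details(
    conn, "SELECT video_id FROM videos_videocredit WHERE profile_id = 1")

# position is not in the index, so the table rows must be read as well.
uncovered = plan_details(
    conn, "SELECT position FROM videos_videocredit WHERE profile_id = 1")

print(covered)
print(uncovered)
```

In SQLite the first plan is expected to mention a COVERING INDEX and the second is not; the exact wording of the plan text varies between SQLite versions, and none of this says how MySQL will plan the query on real data.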


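    However, notwithstanding the solution above, I would prefer to do each task in a separate query. Doing everything in one query is complex to develop and test, and is often expensive for the RDBMS to execute. If you need to add another count, it becomes even more complex.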

    SELECT
      u.full_name,
      COUNT(DISTINCT myvideos.id) AS Uploads
    FROM userprofile_userprofile AS u
    LEFT OUTER JOIN videos_video AS myvideos ON myvideos.uploaded_by_id = u.id
    GROUP BY u.id;
    
    SELECT
      u.full_name,
      COUNT(DISTINCT byothers.id) AS Credited_by_others
    FROM userprofile_userprofile AS u
    LEFT OUTER JOIN videos_videocredit AS c
      USE INDEX (videocredit_profileid_videoid)
      ON c.profile_id = u.id
    LEFT OUTER JOIN videos_video AS byothers
      USE INDEX (video_up_id)
      ON c.video_id = byothers.id AND byothers.uploaded_by_id <> u.id
    GROUP BY u.id;
    
    SELECT
      u.full_name,
      COUNT(DISTINCT credited.id) AS Total_credits
    FROM userprofile_userprofile AS u
    LEFT OUTER JOIN (
      videos_videocredit AS c
      USE INDEX (videocredit_profileid_videoid)
      INNER JOIN videos_video AS credited
        ON c.video_id = credited.id
    ) ON c.profile_id = u.id
    GROUP BY u.id;
    

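    【Comments】:

    • Uploading a video counts as a video credit for the uploader, so when Jane uploads the second video she gets a second credit, and her final credit count would be 2.
    • Just read through and tested the answer. Wow, an incredibly helpful and descriptive answer, thank you! The only issue is the one mentioned in the previous comment: uploading a video counts as a credit (even if the user has not been credited in that video).
    【Solution 3】: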
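
One way to apply the rule from the comments above (an upload always counts as exactly one credit) is to take the UNION of uploader/video pairs and credited/video pairs before counting. The following is a sketch only: it runs the question's sample data in SQLite through Python's sqlite3 module rather than MySQL, the TEXT column types and the credit_counts helper are assumptions made for the example, and it is not the query from any of the answers here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reduced versions of the question's tables; TEXT types are assumptions.
CREATE TABLE userprofile_userprofile (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL);
CREATE TABLE videos_video (id INTEGER PRIMARY KEY, title TEXT NOT NULL,
                           uploaded_by_id INTEGER NOT NULL);
CREATE TABLE videos_videocredit (id INTEGER PRIMARY KEY, video_id INTEGER NOT NULL,
                                 profile_id INTEGER, position TEXT NOT NULL);
-- Sample data from steps 1 to 5 of the question.
INSERT INTO userprofile_userprofile VALUES (1, 'John Smith'), (2, 'Jane Doe');
INSERT INTO videos_video VALUES (1, 'Hamlet', 1), (2, 'Othello', 2);
INSERT INTO videos_videocredit VALUES
    (1, 1, 1, 'director'), (2, 1, 1, 'writer'), (3, 1, 1, 'producer'),
    (4, 1, 2, 'director'), (5, 1, 2, 'editor'), (6, 1, 2, 'decorator'),
    (7, 2, 1, 'writer'),   (8, 2, 1, 'producer');
""")

QUERY = """
SELECT u.full_name,
       SUM(CASE WHEN v.uploaded_by_id =  u.id THEN 1 ELSE 0 END) AS Uploads,
       SUM(CASE WHEN v.uploaded_by_id <> u.id THEN 1 ELSE 0 END) AS Credited_by_others,
       COUNT(p.video_id)                                         AS Total_credits
FROM userprofile_userprofile AS u
LEFT JOIN (
    -- UNION removes duplicates, leaving one row per (user, video) pair
    -- whether the user uploaded the video, was credited in it, or both.
    SELECT uploaded_by_id AS profile_id, id AS video_id FROM videos_video
    UNION
    SELECT profile_id, video_id FROM videos_videocredit
) AS p ON p.profile_id = u.id
LEFT JOIN videos_video AS v ON v.id = p.video_id
GROUP BY u.id, u.full_name
ORDER BY u.id
"""

def credit_counts(conn):
    """Return (full_name, uploads, credited_by_others, total_credits) rows."""
    return conn.execute(QUERY).fetchall()

for row in credit_counts(conn):
    print(row)
# ('John Smith', 1, 1, 2)
# ('Jane Doe', 1, 1, 2)
```

Because every user keeps at least one row through the left joins, the CASE expressions fall through to 0 for users with no videos at all, so no IFNULL() is needed in this variant.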

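    The total credit is simply the sum of upload credits and foreign credits. Since upload credits are easy, here are just the foreign credits. Hold your breath for a double subquery.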

    SELECT profile_id, COUNT(video_id) AS foreign_credit
           FROM (SELECT DISTINCT profile_id, video_id FROM videos_videocredit
                 WHERE (profile_id, video_id) NOT IN (SELECT uploaded_by_id, id FROM videos_video)) AS crsq
    GROUP BY profile_id;
    

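    This becomes clearer with a view. We make a view that selects only the (profile_id, video_id) pairs of people credited in videos they did not upload themselves. We call the view vfcredits: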

    CREATE VIEW vfcredits AS
      SELECT DISTINCT profile_id, video_id FROM videos_videocredit
      WHERE (profile_id, video_id) NOT IN (SELECT uploaded_by_id, id FROM videos_video);
    

    Now we can happily plug this into the main query that sums up the foreign credits:

    SELECT profile_id, COUNT(video_id) AS foreign_credit
    FROM vfcredits
    GROUP BY profile_id;
    

    Now let's put it all together. We make two more views, one that counts own credits and one that counts foreign credits:

    CREATE VIEW vowncount AS
      SELECT uploaded_by_id AS profile_id, COUNT(*) AS own_credits
      FROM videos_video
      GROUP BY uploaded_by_id;
    
    CREATE VIEW vforeigncount AS
      SELECT profile_id, COUNT(video_id) AS foreign_credits
      FROM vfcredits
      GROUP BY profile_id;
    

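    Finally, the complete select:

    -- LEFT JOIN keeps users with no uploads or no foreign credits;
    -- IFNULL() turns their missing counts into 0 instead of NULL.
    SELECT full_name,
           IFNULL(own_credits, 0) AS own_credits,
           IFNULL(foreign_credits, 0) AS foreign_credits,
           IFNULL(own_credits, 0) + IFNULL(foreign_credits, 0) AS total_credits
    FROM userprofile_userprofile
    LEFT JOIN vowncount ON (userprofile_userprofile.id = vowncount.profile_id)
    LEFT JOIN vforeigncount ON (userprofile_userprofile.id = vforeigncount.profile_id);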
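
The foreign-credit step can also be phrased with NOT EXISTS instead of the row-valued NOT IN, which not every SQL engine accepts. A minimal sketch follows, using SQLite via Python's sqlite3 module as a stand-in for MySQL; the reduced table definitions and the foreign_credits helper are assumptions for illustration. It also skips credits whose profile_id is NULL, which the schema in the question allows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Only the columns this sketch needs; the real tables have more.
CREATE TABLE videos_video (id INTEGER PRIMARY KEY, uploaded_by_id INTEGER NOT NULL);
CREATE TABLE videos_videocredit (id INTEGER PRIMARY KEY, video_id INTEGER NOT NULL,
                                 profile_id INTEGER);
-- Uploads and credits from the question's sample data.
INSERT INTO videos_video VALUES (1, 1), (2, 2);
INSERT INTO videos_videocredit VALUES
    (1, 1, 1), (2, 1, 1), (3, 1, 1),
    (4, 1, 2), (5, 1, 2), (6, 1, 2),
    (7, 2, 1), (8, 2, 1);
""")

def foreign_credits(conn):
    """Count, per profile, the distinct videos credited but not self-uploaded."""
    sql = """
        SELECT c.profile_id, COUNT(DISTINCT c.video_id) AS foreign_credit
        FROM videos_videocredit AS c
        WHERE c.profile_id IS NOT NULL
          AND NOT EXISTS (
                -- the credited person is the uploader of this very video
                SELECT 1
                FROM videos_video AS v
                WHERE v.id = c.video_id
                  AND v.uploaded_by_id = c.profile_id
          )
        GROUP BY c.profile_id
        ORDER BY c.profile_id
    """
    return conn.execute(sql).fetchall()

print(foreign_credits(conn))  # [(1, 1), (2, 1)]
```

COUNT(DISTINCT ...) takes the place of the inner SELECT DISTINCT here, so repeated credits for several positions in one video still count once.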
    

    【Comments】:

    • (Thanks for the edit. In my test setup I have tables 'vperson', 'vuploads' and 'vcredits' -- I found 'userprofile_userprofile' a little hard on the fingers and the editor window :-))