【Question title】: Complicated SQL Query About Joining and Limiting
【Posted】: 2011-05-31 20:42:18
【Question】:
SELECT DISTINCT users.id as expert_id, users.firstname, users.lastname
    , projects.id as project_id, projects.project_title
    , projects.project_budget, projects.created as project_created
FROM USERS
RIGHT JOIN expert_skills ON expert_skills.expert_id = users.id
JOIN project_skills ON project_skills.skill_id = expert_skills.skill_id
JOIN projects ON projects.id = project_skills.project_id
WHERE projects.status = 1

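This query returns the distinct projects related to each user, but I want to limit the number of projects per expert. For example, I want the projects related to each expert, with at most 10 projects per expert. How can I add such a limit to my query? Thank you.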

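【Comments】:

  • You may find this post from the MySQL forums helpful.
  • You should include your table schema, the expected output format, and so on.
  • Do you mean that the query should show at most 10 projects for each user? Or that it should only show users who have 10 or fewer projects?
  • @Lynette 10 projects per user.

Tags: sql mysql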


【Solution 1】:

If MySQL had row_number, this would be very easy, but it doesn't.

So you need to adapt Andomar's answer to a similar question.

 -- The counter restarts for each user. row_number is a reserved word in MySQL 8.0, so the alias is row_num.
 set @num := 0, @userid := -1;


 SELECT DISTINCT users.id as expert_id, users.firstname, users.lastname, projects.id as project_id, projects.project_title, projects.project_budget, projects.created as project_created
 FROM 
     USERS
        RIGHT JOIN expert_skills ON expert_skills.expert_id = users.id
        JOIN project_skills ON project_skills.skill_id = expert_skills.skill_id
        JOIN projects ON projects.id = project_skills.project_id
    JOIN
    (select  
            pairs.uid,
            pairs.pid,
            @num := if(@userid = pairs.uid, @num + 1, 1) as row_num,
            @userid := pairs.uid as last_uid
    from
        -- one row per (user, project) pair, so duplicate join rows do not inflate the counter
        (select distinct users.id uid, projects.id pid
         from USERS
            RIGHT JOIN expert_skills ON expert_skills.expert_id = users.id
            JOIN project_skills ON project_skills.skill_id = expert_skills.skill_id
            JOIN projects ON projects.id = project_skills.project_id
         WHERE projects.status = 1
         -- the numbering relies on rows being read in this order, which MySQL does not guarantee
         order by uid, pid desc
        ) as pairs
    ) as projectNum 
    on users.id = projectNum.uid
    and projects.id = projectNum.pid
   where   projectNum.row_num <= 10

See this query on data.stackoverflow.com for how to do this in a database that supports window functions.
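For readers on MySQL 8.0 or later, which does support window functions, the same idea can be sketched roughly as follows. This is an illustration under assumptions, not the linked query: it uses the table and column names from the question, picks the 10 highest project ids per expert (an arbitrary choice of ordering), and the aliases `pairs`, `ranked`, and `rn` are made up for the example.

```sql
-- Sketch only: assumes MySQL 8.0+ and the tables from the question.
SELECT ranked.expert_id, ranked.firstname, ranked.lastname,
       ranked.project_id, ranked.project_title,
       ranked.project_budget, ranked.project_created
FROM (
    -- number each expert's projects, newest project id first
    SELECT pairs.*,
           ROW_NUMBER() OVER (PARTITION BY pairs.expert_id
                              ORDER BY pairs.project_id DESC) AS rn
    FROM (
        -- one row per (expert, project) pair; DISTINCT removes the
        -- duplicates produced when several skills match the same project
        SELECT DISTINCT users.id AS expert_id, users.firstname, users.lastname,
               projects.id AS project_id, projects.project_title,
               projects.project_budget, projects.created AS project_created
        FROM users
        JOIN expert_skills ON expert_skills.expert_id = users.id
        JOIN project_skills ON project_skills.skill_id = expert_skills.skill_id
        JOIN projects ON projects.id = project_skills.project_id
        WHERE projects.status = 1
    ) AS pairs
) AS ranked
WHERE ranked.rn <= 10
ORDER BY ranked.expert_id, ranked.project_id DESC;
```

Deduplicating before numbering matters here: if the row numbers were assigned to the raw join output, an expert who matches one project through three skills would use up three of the ten slots on it.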

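【Comments】:

  • Thanks for the windowing information, but I couldn't get this query to work. It returns this error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'select users.id uid, project.id pids, @num :' at line 8
  • Sorry, I forgot the join, and one of the table names was wrong.
  • I fixed some errors (project.id to projects.id), and now the query runs, but it returns zero rows every time.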
【Solution 2】:

A subselect limited with rownum should work fine. (ROWNUM is Oracle syntax; in MySQL the same limit is written with LIMIT, as below.)

SELECT DISTINCT users.id as expert_id, users.firstname, users.lastname, projects.id as project_id, projects.project_title, projects.project_budget, projects.created as project_created
FROM USERS
RIGHT JOIN expert_skills ON expert_skills.expert_id = users.id
JOIN project_skills ON project_skills.skill_id = expert_skills.skill_id
-- the LIMIT caps the total set of projects at 10; it is not applied per expert
JOIN (select * from projects where status = 1 order by id limit 10) projects ON projects.id = project_skills.project_id
WHERE projects.status = 1

【Comments】:

    【Solution 3】:

    You can combine stored procedures and in-memory tables with control flow using a WHILE loop:

    DELIMITER $$
    
    DROP PROCEDURE IF EXISTS `Get10ProjectsPerUserByStatus`$$
    -- 1. DEFINE STORED PROCEDURE
    CREATE PROCEDURE `Get10ProjectsPerUserByStatus`(_status INT)
    BEGIN
        -- 2. DECLARE VARIABLES AND IN-MEMORY TABLES
        DECLARE _id INT;
        DROP TABLE IF EXISTS temp_user;
        DROP TABLE IF EXISTS temp_project_user;
        CREATE TABLE temp_user (id INT) ENGINE=MEMORY;
        CREATE TABLE temp_project_user (p_id INT, u_id INT) ENGINE=MEMORY;
    
        -- 3. ADD ALL USERS AND LOOP BY REMOVING 1 USER AT A TIME
        INSERT INTO temp_user SELECT id FROM users;
        WHILE (SELECT COUNT(*) FROM temp_user) > 0 DO
            SET _id = (SELECT MIN(id) FROM temp_user);
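            INSERT INTO temp_project_user
            SELECT DISTINCT ps.project_id, _id
            FROM project_skills ps 
            JOIN expert_skills es ON ps.skill_id = es.skill_id
            JOIN projects p ON p.id = ps.project_id
            WHERE es.expert_id = _id
              AND p.status = _status  -- apply the status passed to the procedure
            LIMIT 10;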
            DELETE FROM temp_user WHERE id = _id;
        END WHILE;
    
        -- 4. SELECT FROM IN-MEMORY TABLE AND JOIN TO EXISTING SCHEMA
        SELECT DISTINCT t.u_id AS expert_id,
        u.firstname, 
        u.lastname,
        t.p_id AS project_id, 
        p.project_title,
        p.project_budget, 
        p.created AS project_created
        FROM temp_project_user t
        INNER JOIN users u ON u.id = t.u_id
        INNER JOIN projects p ON p.id = t.p_id;
    
        -- 5. DROP IN-MEMORY TABLES
        DROP TABLE temp_user;
        DROP TABLE temp_project_user;
        END$$
    
    DELIMITER ;
    
    -- 6. CALL STORED PROCEDURE & DROP WHEN FINISHED
    CALL Get10ProjectsPerUserByStatus(1);
    
    DROP PROCEDURE IF EXISTS `Get10ProjectsPerUserByStatus`;
    

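    Note: this goes beyond the question asked here, but the limit of 10 projects could also be made a parameter of the stored procedure, using the same logic:

    -- GET 10 PROJECTS PER USER WITH STATUS = 1
    CALL GetProjectsPerUserByStatus(1, 10);
    

    To do that, you would need one more in-memory table and another WHILE loop.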
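One possible shortcut, offered as a sketch and not as the answer's own method: since MySQL 5.5.6, LIMIT inside a stored program accepts an integer routine parameter, so the limit can be passed straight in without an extra table or loop. The procedure name is the one used in the CALL above; the parameter name `_limit`, the join to `projects` for the status filter, the DISTINCT, and the ORDER BY are assumptions added for illustration.

```sql
DELIMITER $$

DROP PROCEDURE IF EXISTS `GetProjectsPerUserByStatus`$$
-- Sketch only: assumes MySQL 5.5.6+ and the tables from the question.
CREATE PROCEDURE `GetProjectsPerUserByStatus`(_status INT, _limit INT)
BEGIN
    DECLARE _id INT;
    DROP TABLE IF EXISTS temp_user;
    DROP TABLE IF EXISTS temp_project_user;
    CREATE TABLE temp_user (id INT) ENGINE=MEMORY;
    CREATE TABLE temp_project_user (p_id INT, u_id INT) ENGINE=MEMORY;

    -- Process one user at a time, as in the procedure above
    INSERT INTO temp_user SELECT id FROM users;
    WHILE (SELECT COUNT(*) FROM temp_user) > 0 DO
        SET _id = (SELECT MIN(id) FROM temp_user);
        INSERT INTO temp_project_user
        SELECT DISTINCT ps.project_id, _id
        FROM project_skills ps
        JOIN expert_skills es ON ps.skill_id = es.skill_id
        JOIN projects p ON p.id = ps.project_id
        WHERE es.expert_id = _id
          AND p.status = _status
        ORDER BY ps.project_id
        LIMIT _limit;  -- routine parameter used directly as the row limit
        DELETE FROM temp_user WHERE id = _id;
    END WHILE;

    -- Join the collected (project, user) pairs back to the real tables
    SELECT t.u_id AS expert_id, u.firstname, u.lastname,
           t.p_id AS project_id, p.project_title,
           p.project_budget, p.created AS project_created
    FROM temp_project_user t
    INNER JOIN users u ON u.id = t.u_id
    INNER JOIN projects p ON p.id = t.p_id;

    DROP TABLE temp_user;
    DROP TABLE temp_project_user;
END$$

DELIMITER ;

CALL GetProjectsPerUserByStatus(1, 10);
```

On servers older than 5.5.6 this form is not accepted, and an approach like the one described above (or a prepared statement) would be needed instead.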

    【Comments】:

      【Solution 4】:

      How about something like this?

      SELECT  DISTINCT 
        users.id as expert_id, 
        users.firstname, 
        users.lastname, 
        myProjects.project_id, 
        myProjects.project_title, 
        myProjects.project_budget, 
        myProjects.project_created
      FROM USERS RIGHT JOIN 
        expert_skills ON expert_skills.expert_id = users.id JOIN 
        project_skills ON project_skills.skill_id = expert_skills.skill_id JOIN 
        (SELECT P1.id as project_id, 
          P1.project_title, 
          P1.project_budget, 
          P1.created as project_created  
        FROM projects AS P1
        -- keeps only active projects among the 10 lowest project ids overall (not per expert)
        WHERE   P1.status = 1 AND 
          (SELECT COUNT(*)
          FROM projects AS P2 
          WHERE P2.ID <= P1.ID) <= 10) AS myProjects ON myProjects.project_id = project_skills.project_id
      

      【Comments】:

        【Solution 5】:

        Try this:

        SELECT 
          users.id as expert_id, users.firstname,
          projects.id as project_id, projects.project_title,
          count(distinct users.id, ps2.project_id)
        FROM USERS
          JOIN expert_skills ON expert_skills.expert_id = users.id
          JOIN project_skills ON project_skills.skill_id = expert_skills.skill_id
          JOIN projects ON projects.id = project_skills.project_id 
          JOIN expert_skills es2 ON es2.expert_id = users.id
          LEFT JOIN project_skills ps2 ON ps2.skill_id = es2.skill_id and ps2.project_id < projects.id
        WHERE projects.status = 1
        group by users.id, users.firstname, projects.id, projects.project_title
        having count(distinct users.id, ps2.project_id) < 10
        order by users.id, projects.id
        

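        This assumes that any subset of 10 projects is acceptable; otherwise you will have to adjust the criteria.

        For clarity, I removed some fields from the select.

        【Comments】:

        • Doesn't this filter out experts who have more than 10 projects, instead of returning 10 of their projects?
        • @Schultz999 No, it does not filter out users who have more than 10 projects. Note that there are 2 expert_skills tables and 2 project_skills tables; I use them to count the projects for each user and use that count as the limiting factor. The trick is "ps2.project_id
        • To be more explicit: without the GROUP BY, the result would show each (user, project) tuple repeated N times, where N is the number of other (user, project) tuples whose project.id is less than the main project.id. When you group, count(...) shows 0, 1, 2, 3... in order for each user, and resets for the next user. With the HAVING clause, (user, project) tuples with a count of 10 or more are discarded.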
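The counting trick described in these comments can also be written as a correlated subquery, which some readers may find easier to follow. The following is a rough sketch under assumptions: it uses the table names from the question, keeps the 10 lowest project ids per expert, and also applies the `status = 1` filter to the projects being counted, which is a guess at the intent and not something stated in the answer. The aliases `m`, `es2`, `ps2`, and `p2` are illustrative.

```sql
-- Sketch only: works without window functions; tables are those from the question.
SELECT m.expert_id, m.project_id
FROM (
    -- one row per (expert, active project) pair
    SELECT DISTINCT es.expert_id, ps.project_id
    FROM expert_skills es
    JOIN project_skills ps ON ps.skill_id = es.skill_id
    JOIN projects p ON p.id = ps.project_id
    WHERE p.status = 1
) AS m
WHERE (
    -- how many of this expert's active projects have a smaller id?
    SELECT COUNT(DISTINCT ps2.project_id)
    FROM expert_skills es2
    JOIN project_skills ps2 ON ps2.skill_id = es2.skill_id
    JOIN projects p2 ON p2.id = ps2.project_id
    WHERE es2.expert_id = m.expert_id
      AND p2.status = 1
      AND ps2.project_id < m.project_id
) < 10
ORDER BY m.expert_id, m.project_id;
```

A pair is kept when fewer than 10 of the same expert's projects come before it, so each expert ends up with at most 10 rows. The subquery runs once per pair, so this is likely to be slow on large tables.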
        【Solution 6】:

        I think what you need is a subquery with a TOP 10 in it (LIMIT 10 in MySQL, which has no TOP).

        Something like this (I didn't really understand the query, so it may not be what you need):

        SELECT users.id as expert_id, users.firstname, users.lastname, projs.*
          FROM Users
               RIGHT JOIN 
               (SELECT DISTINCT expert_skills.expert_id as skill_expert_id, projects.id as project_id, projects.project_title, projects.project_budget, projects.created as project_created
               FROM expert_skills 
                    right JOIN project_skills ON project_skills.skill_id = expert_skills.skill_id
                    right JOIN projects ON projects.id = project_skills.project_id
              WHERE projects.status = 1
              LIMIT 10 ) projs ON projs.skill_expert_id = users.id
        

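        【Comments】:

          【Solution 7】:

          Most databases have a way to ask for the first X rows of a query. In DB2, you can use the following statement to return the first 10 rows.

          select * from tableName fetch first 10 rows only;

          I believe in MySQL you use the LIMIT keyword.

          select * from tableName limit 10;

          This won't work if you only want people who have 10 or fewer projects. It isn't clear to me whether you want the people who have 10 or fewer projects, or just want to return the first 10 projects.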

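          【Comments】:

            【Solution 8】:

            I would start from the other direction: first get the projects, then find their skills and the people who have those skills, adding a sequence counter for each "expert". From that result set, simply apply a WHERE clause to exclude anything beyond the "cutoff". By using MySQL @variables together with an ORDER BY clause, you get the results back in expert ID order, so as each row is processed you check whether it belongs to the same expert. If so, add one to the sequence counter. If it is a different expert, reset the counter to one. Then update who the last expert was, so it can be compared against the next record in the result set.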

            SELECT
                  PreQuery.expert_id, 
                  PreQuery.firstname, 
                  PreQuery.lastname, 
                  PreQuery.Project_ID,
                  PreQuery.Project_Title,
                  PreQuery.Project_Budget,
                  PreQuery.Project_Created,
                  PreQuery.LastSeq
               from 
                  ( SELECT DISTINCT
                          U.id as expert_id, 
                          U.firstname, 
                          U.lastname, 
                          P.ID as Project_ID,
                          P.Project_Title,
                          P.Project_Budget,
                          P.Created as Project_Created,
                          @Seq := if( @LastExpert = U.ID, @Seq +1, 1 ) LastSeq,
                          @LastExpert := U.ID as IgnoreThis
                       from
                          Projects P
                             JOIN Project_Skills PS
                                ON P.ID = PS.Project_ID
                                JOIN Expert_Skills ES
                                   ON PS.Skill_ID = ES.Skill_ID
                                   JOIN Users U
                                      ON ES.Expert_ID = U.ID,
                          (select @Seq := 0, @LastExpert := 0 ) SQLVars   
                       where
                          P.Status = 1
                       order by 
                          U.ID ) PreQuery
               where
                  PreQuery.LastSeq < 11
            

            【Comments】:

              【Solution 9】:

              How about this...

              SELECT users.id as expert_id, users.firstname, users.lastname, projects.id as project_id, projects.project_title, projects.project_budget, projects.created as project_created
              FROM USERS
              RIGHT JOIN expert_skills ON expert_skills.expert_id = users.id
              JOIN project_skills ON project_skills.skill_id = expert_skills.skill_id
              JOIN projects ON projects.id = project_skills.project_id
              WHERE projects.status = 1
              GROUP BY users.id, users.firstname, users.lastname, projects.id, projects.project_title, projects.project_budget, projects.created
              HAVING COUNT(*) < 10
              

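              This assumes that your original query does exactly what you want, and that you only want to restrict the results to those where fewer than 10 rows are returned.

              It's probably not as efficient as the other posts here, and I may have got the wrong end of the stick, but that's my 2 cents' worth :)

              【Comments】:

              • Grouping by all the output fields will return the same result unless there are duplicate records. The question is to return the top 10 projects for each expert, not to find identical projects that occur more than 10 times.
              【Solution 10】:

              I suggest using DENSE_RANK. I don't have your table structure, so I created my own user and project tables with a one-to-many relationship. The query that selects the first 3 projects (by ProjectId) for each user looks like this (note that you can put it into a stored procedure and use the number as an input parameter):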

              declare @numberOfProjects int
              set @numberOfProjects = 3
              
              ;with topNProjects(userid, projectid, createdtutc, dense_rank)
              as (
                  select p.userid, P.ProjectId, p.CreatedDtUtc, DENSE_RANK() OVER (PARTITION BY P.UserId ORDER BY P.ProjectId) AS DENSE_RANK
                  from DS_Project P
              )
              select userid, projectid, createdtutc from topNProjects
              where dense_rank <= @numberOfProjects
              order by projectid desc
              

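              DENSE_RANK effectively returns the number of distinct values that come before the current one within the partition, plus 1.

              Edited: it was pointed out that this question is about MySQL, not MSSQL. I don't have MySQL installed, but apparently it doesn't have ranking functions yet (or never will). I did some searching on ranking and found this SO post that may help answer the original question: How to perform grouped ranking in MySQL. So once ranking is added, the logic stays the same: select all records whose rank is no higher than the number of projects sought per user.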
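MySQL did add ranking functions in version 8.0, so on such a server the same logic can be applied to the tables from the question. What follows is a hedged sketch, not a tested answer: it assumes MySQL 8.0 or later, ranks by project id in descending order (an arbitrary choice), and the aliases `ranked` and `project_rank` are made up for the example.

```sql
-- Sketch only: assumes MySQL 8.0+ and the tables from the question.
SELECT DISTINCT ranked.expert_id, ranked.project_id, ranked.project_title
FROM (
    SELECT es.expert_id,
           p.id AS project_id,
           p.project_title,
           -- rows for the same (expert, project) pair tie and share one rank
           DENSE_RANK() OVER (PARTITION BY es.expert_id
                              ORDER BY p.id DESC) AS project_rank
    FROM expert_skills es
    JOIN project_skills ps ON ps.skill_id = es.skill_id
    JOIN projects p ON p.id = ps.project_id
    WHERE p.status = 1
) AS ranked
WHERE ranked.project_rank <= 10;
```

DENSE_RANK suits this join because an expert who matches a project through several skills produces duplicate rows; those duplicates receive the same rank and leave no gaps, so the filter keeps 10 distinct projects per expert and the outer DISTINCT collapses the repeats.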

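              【Comments】:

              • Close, but no cigar, as far as I can tell. He asked for MySQL, and this is MSSQL, or possibly something else.
              • @data: Yes, you're right... I didn't expect MySQL to lack ranking functions. Now I see the bug that was opened in '06 and is still open.
              • If you downvoted me, please explain why, so that I don't make the same mistake when answering other questions.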