【Question title】: Merging rows in SQL for similar IDs?
【Posted】: 2011-10-09 14:51:41
【Question description】:

I've run into an interesting dilemma. I have a database schema like the following:

GameList:
+-------+----------+-----------+------------+--------------------------------+
|  id   | steam_id | origin_id | impulse_id |           game_title           |
+-------+----------+-----------+------------+--------------------------------+
|   1   |   17450  |   NULL    |    NULL    |      Dragon Age: Origins       |
|   2   |   NULL   | 138994900 |    NULL    |    Dragon Age(TM): Origins     |
|   3   |   NULL   |   NULL    |  dragonage |      Dragon Age Origins        |
|   4   |   47850  | 201841300 |  fifamgr11 |        FIFA Manager 11         |
|  ...  |   ...    |    ...    |     ...    |              ...               |
+-------+----------+-----------+------------+--------------------------------+

GameAlias:
+----------+-----------+
|  old_id  |  new_id   |
+----------+-----------+
|    2     |     1     |
|    3     |     1     |
|   ...    |    ...    |
+----------+-----------+

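Depending on whether the stores use the same title for a game, there may be no problem, or there may be multiple rows for the same game. The GameAlias table exists to resolve this: it declares that id 2 and id 3 are merely aliases of id 1.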

What I need is a SQL query that uses both the GameList table and the GameAlias table and returns the following:

ConglomerateGameList:
+-------+----------+-----------+------------+--------------------------------+
|  id   | steam_id | origin_id | impulse_id |           game_title           |
+-------+----------+-----------+------------+--------------------------------+
|   1   |   17450  | 138994900 |  dragonage |      Dragon Age: Origins       |
|   4   |   47850  | 201841300 |  fifamgr11 |        FIFA Manager 11         |
|  ...  |   ...    |    ...    |     ...    |              ...               |
+-------+----------+-----------+------------+--------------------------------+

Note that I want the game title of the "new id". The game titles of any "old id" rows should be discarded/ignored.

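I'd also like to point out that I cannot modify the GameList table to solve this. If I simply rewrote the table to look like my desired output, then every night, when I pull the updated game lists from the stores, the import would fail to find those games in the database and would generate new rows, like so: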

+-------+----------+-----------+------------+--------------------------------+
|  id   | steam_id | origin_id | impulse_id |           game_title           |
+-------+----------+-----------+------------+--------------------------------+
|   1   |   17450  | 138994900 |  dragonage |      Dragon Age: Origins       |
|   4   |   47850  | 201841300 |  fifamgr11 |        FIFA Manager 11         |
|  ...  |   ...    |    ...    |     ...    |              ...               |
|  8139 |   NULL   | 138994900 |    NULL    |     Dragon Age(TM): Origins    |
|  8140 |   NULL   |    NULL   |  dragonage |      Dragon Age Origins        |
+-------+----------+-----------+------------+--------------------------------+

I also can't assume that a game's store id will never change, since Steam has been known to change ids when a major update for a game is released.

Bonus points if it can recognize recursive aliases, like so:

GameAlias:
+----------+-----------+
|  old_id  |  new_id   |
+----------+-----------+
|    2     |     1     |
|    3     |     2     |
|   ...    |    ...    |
+----------+-----------+

Here id 3 is an alias of id 2, which is itself an alias of id 1. If recursive aliases are not possible, I can build my application logic to prevent them.
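
A minimal sketch of what such application logic might look like, assuming the aliases are held in an in-memory mapping from old_id to new_id before being written to GameAlias. The function names `resolve` and `add_alias` are hypothetical, and Python is used only for illustration. The idea is to flatten every chain at insert time so that the stored aliases are never more than one level deep:

```python
def resolve(aliases: dict, game_id: int) -> int:
    """Follow old_id -> new_id links until reaching an id that is not an alias."""
    seen = {game_id}
    while game_id in aliases:
        game_id = aliases[game_id]
        if game_id in seen:
            raise ValueError(f"alias cycle detected at id {game_id}")
        seen.add(game_id)
    return game_id


def add_alias(aliases: dict, old_id: int, new_id: int) -> None:
    """Record old_id as an alias while keeping every stored alias one level deep."""
    target = resolve(aliases, new_id)
    if target == old_id:
        raise ValueError("alias would make a game an alias of itself")
    # Re-point existing aliases that used to end at old_id.
    for existing_old, existing_new in list(aliases.items()):
        if existing_new == old_id:
            aliases[existing_old] = target
    aliases[old_id] = target


aliases = {}
add_alias(aliases, 2, 1)
add_alias(aliases, 3, 2)  # stored as 3 -> 1, not 3 -> 2
print(aliases)
```

With the aliases kept flat like this, a non-recursive query over GameAlias would be enough.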

【Question comments】:

    Tags: mysql sql subquery alias


    【Solution 1】:

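    Does this work? Adjust the table names to match your schema.

    -- Games that take part in an alias: merge each group into its new_id row
    select ga.new_id as id, max(gl.steam_id) as steam_id, max(gl.origin_id) as origin_id,
    max(gl.impulse_id) as impulse_id,
    max(if(gl.id = ga.new_id, gl.game_title, NULL)) as game_title
    from GameList gl, GameAlias ga
    where (gl.id = ga.new_id OR gl.id = ga.old_id)
    group by ga.new_id
    
    union
    
    -- Games that appear in no alias at all: pass through unchanged
    select gl.id, gl.steam_id, gl.origin_id, gl.impulse_id, gl.game_title
    from GameList gl
    where gl.id not in (
        select new_id from GameAlias
        union
        select old_id from GameAlias)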
    

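    【Comments】:

    • Works like a charm. In fact, I found a way to simplify the query further by using COALESCE() to return the id from the alias table, or the original id if there is no alias. The GROUP BY combined with MAX(...) on each store id column really helped me a lot, though.
    【Solution 2】:

    1. First solution (non-recursive):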

    CREATE TABLE GameList
    (
         id         INT NOT NULL PRIMARY KEY
        ,steam_id   INT NULL
        ,origin_id  INT NULL
        ,impulse_id NVARCHAR(50) NULL            
        ,game_title NVARCHAR(50) NOT NULL
    );
    INSERT  GameList(id, steam_id, origin_id, impulse_id, game_title)
    SELECT  1,  17450,  NULL,       NULL,       'Dragon Age: Origins'
    UNION ALL
    SELECT  2,  NULL,   138994900,  NULL,       'Dragon Age(TM): Origins'
    UNION ALL
    SELECT  3,  NULL,   NULL,       'dragonage','Dragon Age Origins'   
    UNION ALL
    SELECT  4,  47850,  201841300,  'fifamgr11','FIFA Manager 11';
    
    CREATE TABLE GameAlias
    (
        old_id INT NOT NULL PRIMARY KEY
        ,new_id INT NOT NULL
    );
    
    INSERT  GameAlias (old_id, new_id) VALUES (2,1);
    INSERT  GameAlias (old_id, new_id) VALUES (3,1);
    
    -- Solution 1
    SELECT  COALESCE(ga.new_id, gl.id) new_id
            ,MAX(gl.steam_id) new_steam_id
            ,MAX(gl.origin_id) new_origin_id
            ,MAX(gl.impulse_id) new_impulse_id
            ,MAX( CASE WHEN ga.old_id IS NULL THEN gl.game_title ELSE NULL END ) new_game_title
    FROM    GameList gl
    LEFT OUTER JOIN GameAlias ga ON gl.id = ga.old_id
    GROUP BY COALESCE(ga.new_id, gl.id);
    -- End of Solution 1    
    DROP TABLE GameList;
    DROP TABLE GameAlias;
    

    Result:

    1   17450   138994900   dragonage   Dragon Age: Origins
    4   47850   201841300   fifamgr11   FIFA Manager 11
    

    2. Second solution (recursion depth = three levels):

    CREATE TABLE GameList
    (
         id         INT NOT NULL PRIMARY KEY
        ,steam_id   INT NULL
        ,origin_id  INT NULL
        ,impulse_id NVARCHAR(50) NULL            
        ,game_title NVARCHAR(50) NOT NULL
    );
    INSERT  GameList(id, steam_id, origin_id, impulse_id, game_title)
    SELECT  1,  17450,  NULL,       NULL,       'Dragon Age: Origins'
    UNION ALL
    SELECT  2,  NULL,   138994900,  NULL,       'Dragon Age(TM): Origins'
    UNION ALL
    SELECT  3,  NULL,   NULL,       'dragonage','Dragon Age Origins'   
    UNION ALL
    SELECT  4,  47850,  201841300,  'fifamgr11','FIFA Manager 11'
    UNION ALL
    SELECT  5,  11111,  NULL,       NULL,       'Starcraft 1'
    UNION ALL
    SELECT  6,  NULL,   1111111111, NULL,       'Starcraft 1.1'   
    UNION ALL
    SELECT  7,  NULL,   NULL,       NULL,      'Starcraft 1.2'
    UNION ALL
    SELECT  8,  NULL,   NULL,       'sc1',      'Starcraft 1.3';
    
    CREATE TABLE GameAlias
    (
        old_id INT NOT NULL PRIMARY KEY
        ,new_id INT NOT NULL
    );
    
    INSERT  GameAlias (old_id, new_id) VALUES (2,1);
    INSERT  GameAlias (old_id, new_id) VALUES (3,1);
    INSERT  GameAlias (old_id, new_id) VALUES (6,5);
    INSERT  GameAlias (old_id, new_id) VALUES (7,6);
    INSERT  GameAlias (old_id, new_id) VALUES (8,7);
    
    -- Solution 2
    CREATE TEMPORARY TABLE Mappings
    (
        old_id INT NOT NULL PRIMARY KEY
        ,new_id INT NOT NULL
    );
    INSERT  Mappings (old_id, new_id)
    -- first level mapping
    SELECT  ga.old_id, ga.new_id
    FROM    GameAlias ga
    WHERE   ga.new_id NOT IN (SELECT t.old_id FROM GameAlias t)
    -- second level mapping
    UNION ALL
    SELECT  ga.old_id, ga2.new_id
    FROM    GameAlias ga
    INNER JOIN GameAlias ga2 ON ga.new_id = ga2.old_id
    WHERE   ga2.new_id NOT IN (SELECT t.old_id FROM GameAlias t)
    -- third level mapping
    UNION ALL
    SELECT  ga.old_id, ga3.new_id
    FROM    GameAlias ga
    INNER JOIN GameAlias ga2 ON ga.new_id = ga2.old_id
    INNER JOIN GameAlias ga3 ON ga2.new_id = ga3.old_id;
    
    SELECT  COALESCE(ga.new_id, gl.id) new_id
            ,MAX(gl.steam_id) new_steam_id
            ,MAX(gl.origin_id) new_origin_id
            ,MAX(gl.impulse_id) new_impulse_id
            ,MAX( CASE WHEN ga.old_id IS NULL THEN gl.game_title ELSE NULL END ) new_game_title
    FROM    GameList gl
    LEFT OUTER JOIN Mappings ga ON gl.id = ga.old_id
    GROUP BY COALESCE(ga.new_id, gl.id);
    
    DROP TEMPORARY TABLE Mappings;
    -- End of Solution 2
    
    DROP TABLE GameList;
    DROP TABLE GameAlias;
    

    Result:

    1   17450   138994900   dragonage   Dragon Age: Origins
    4   47850   201841300   fifamgr11   FIFA Manager 11
    5   11111   1111111111  sc1         Starcraft 1
    

    Sorry, MySQL (as of this writing) has no recursive queries/CTEs.
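
On engines that do support recursive CTEs (MySQL added `WITH RECURSIVE` in version 8.0), the fixed three-level mapping could be replaced by one that follows alias chains of any depth. The following is a sketch under stated assumptions, not a tested MySQL solution: it runs the same sample data through SQLite via Python's `sqlite3` module as a stand-in, uses `INTEGER`/`TEXT` column types instead of the originals, and assumes GameAlias contains no cycles (a cycle would make the `UNION ALL` recursion loop forever).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in; the thread itself targets MySQL
conn.executescript("""
    CREATE TABLE GameList (
        id INTEGER PRIMARY KEY, steam_id INTEGER, origin_id INTEGER,
        impulse_id TEXT, game_title TEXT NOT NULL
    );
    INSERT INTO GameList VALUES
        (1, 17450, NULL, NULL, 'Dragon Age: Origins'),
        (2, NULL, 138994900, NULL, 'Dragon Age(TM): Origins'),
        (3, NULL, NULL, 'dragonage', 'Dragon Age Origins'),
        (4, 47850, 201841300, 'fifamgr11', 'FIFA Manager 11'),
        (5, 11111, NULL, NULL, 'Starcraft 1'),
        (6, NULL, 1111111111, NULL, 'Starcraft 1.1'),
        (7, NULL, NULL, NULL, 'Starcraft 1.2'),
        (8, NULL, NULL, 'sc1', 'Starcraft 1.3');
    CREATE TABLE GameAlias (old_id INTEGER PRIMARY KEY, new_id INTEGER NOT NULL);
    INSERT INTO GameAlias VALUES (2, 1), (3, 1), (6, 5), (7, 6), (8, 7);
""")

QUERY = """
WITH RECURSIVE Mappings (old_id, new_id) AS (
    -- anchor: aliases whose new_id is already a final (non-aliased) id
    SELECT ga.old_id, ga.new_id
    FROM GameAlias ga
    WHERE ga.new_id NOT IN (SELECT old_id FROM GameAlias)
    UNION ALL
    -- step: aliases pointing at an id whose final id is already known
    SELECT ga.old_id, m.new_id
    FROM GameAlias ga
    JOIN Mappings m ON ga.new_id = m.old_id
)
SELECT COALESCE(m.new_id, gl.id) AS id,
       MAX(gl.steam_id) AS steam_id,
       MAX(gl.origin_id) AS origin_id,
       MAX(gl.impulse_id) AS impulse_id,
       -- keep only the title of the row that is not an alias
       MAX(CASE WHEN m.old_id IS NULL THEN gl.game_title END) AS game_title
FROM GameList gl
LEFT JOIN Mappings m ON gl.id = m.old_id
GROUP BY COALESCE(m.new_id, gl.id)
ORDER BY 1
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The anchor part mirrors the "first level mapping" of the second solution, and the recursive step takes the place of the hand-written second and third levels, so the final SELECT stays the same as before.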

    【Comments】:
