【Question title】:How to select a set number of random records where one column is unique?
【Posted】:2010-10-02 05:13:39
【Question description】:

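I have been struggling with this SQL query requirement all day and was wondering if someone could help me.

I have a table of sports questions. One of the columns is the team the question relates to. My requirement is to return a set number of random questions where the teams are unique.

Say we have the following table and want 5 questions: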

Question        Answer        Team
-----------------------------------
question 1      answer 1      team A
question 2      answer 2      team B
question 3      answer 3      team B
question 4      answer 3      team D
question 5      answer 3      team A
question 6      answer 3      team C
question 7      answer 3      team F
question 8      answer 3      team C
question 9      answer 3      team G
question 10     answer 3      team D

A valid result would be:

question 1      answer 1      team A
question 2      answer 2      team B
question 4      answer 3      team D
question 6      answer 3      team C
question 7      answer 3      team F

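I feel this should be possible as a single clean SQL statement with some clever use of Distinct and Take, but I haven't been able to get it right yet.

The best solution so far is from Mladen Prajdic. I have just updated it slightly to improve its randomness: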

SELECT TOP 10 * 
FROM    (SELECT ROW_NUMBER() OVER(PARTITION BY Team ORDER BY Team, NEWID()) AS RN, *
    FROM Question
    ) teams
WHERE   RN = 1  -- keep exactly one (random) question per team, including teams with a single question
ORDER BY NEWID()

【Question discussion】:

    Tags: sql random


    【Solution 1】:

    For SQL Server 2005 you can do this (here @t is a table variable holding the questions):

    select top 5 * 
    from    (
                select ROW_NUMBER() over(partition by team order by team) as RN, *
                from @t 
            ) t
    where RN = 1
    order by NEWID()
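
The same pattern can be tried without a SQL Server instance. The sketch below is a port to SQLite through Python's sqlite3 module, not the answer's own code: random() stands in for NEWID(), LIMIT for TOP, and the table name questions and the helper pick_questions are made up for illustration. It assumes an SQLite build with window-function support (3.25 or later), and it orders by random() inside the partition as well, so the question chosen for each team is random too.

```python
import sqlite3

# Sample data from the question: (question, answer, team).
ROWS = [
    ("question 1", "answer 1", "team A"),
    ("question 2", "answer 2", "team B"),
    ("question 3", "answer 3", "team B"),
    ("question 4", "answer 3", "team D"),
    ("question 5", "answer 3", "team A"),
    ("question 6", "answer 3", "team C"),
    ("question 7", "answer 3", "team F"),
    ("question 8", "answer 3", "team C"),
    ("question 9", "answer 3", "team G"),
    ("question 10", "answer 3", "team D"),
]


def pick_questions(conn, limit=5):
    """Return up to `limit` random questions, at most one per team."""
    sql = """
        SELECT question, answer, team
        FROM (
            SELECT question, answer, team,
                   -- number the rows of each team in random order
                   ROW_NUMBER() OVER (PARTITION BY team ORDER BY random()) AS rn
            FROM questions
        )
        WHERE rn = 1          -- one question per team
        ORDER BY random()     -- shuffle the teams before limiting
        LIMIT ?
    """
    return conn.execute(sql, (limit,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE questions (question TEXT, answer TEXT, team TEXT)")
conn.executemany("INSERT INTO questions VALUES (?, ?, ?)", ROWS)

for row in pick_questions(conn):
    print(row)
```

Each run prints five rows with five different teams; which questions and which teams appear changes from run to run.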
    

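    【Discussion】:

    • Thanks. I had never used the PARTITION keyword before, so I learned something new. I have updated the query slightly to improve the randomness.
    • Cool. It never occurred to me that you could order by NEWID() inside ROW_NUMBER(). So I learned something new as well :)
    【Solution 2】:

    This should do what you need, in Oracle; for a different database you will obviously need to use its own source of random numbers. There may be a better way; let's hope someone else points it out to us :p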

    select question, answer, team
    from
    (
    select question, answer, team, r
    from
    (
    select 
        question, 
        answer, 
        team,
        rank() over (partition by team order by dbms_random.value) r 
    from questions
    )
    where r = 1
    order by dbms_random.value
    ) where rownum<=5;
    

    Test code:

    create table questions(question varchar2(16), answer varchar2(16), team varchar2(16));
    
    insert into questions(question, answer, team)
    values ('question 1',      'answer 1',      'team A');
    
    insert into questions(question, answer, team)
    values ('question 2',      'answer 2',      'team B');
    
    insert into questions(question, answer, team)
    values ('question 3',      'answer 3',      'team B');
    
    insert into questions(question, answer, team)
    values ('question 4',      'answer 3',      'team D');
    
    insert into questions(question, answer, team)
    values ('question 5',      'answer 3',      'team A');
    
    insert into questions(question, answer, team)
    values ('question 6',      'answer 3',      'team C');
    
    insert into questions(question, answer, team)
    values ('question 7',      'answer 3',      'team F');
    
    insert into questions(question, answer, team)
    values ('question 8',      'answer 3',      'team C');
    
    insert into questions(question, answer, team)
    values ('question 9',      'answer 3',      'team G');
    
    insert into questions(question, answer, team)
    values ('question 10',    'answer 3',      'team D');
    
    commit;
    

    【Discussion】:

      【Solution 3】:

      In PostgreSQL (which has DISTINCT ON), I would probably do something like this:

      select distinct on (Team) Question, Answer, Team from test order by Team, random() limit 5;
      

      Just tested it. Seems to work.
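
One thing worth checking with a query of this shape: DISTINCT ON requires the ORDER BY to start with Team, so LIMIT 5 is applied to rows already sorted by team. With more than five teams that would return the same alphabetically first five teams every time, with only the question within each team varying. If the teams should be random as well, the one-per-team result would need a second shuffle (for example in an outer query) before limiting. The two-step idea is sketched below in plain Python rather than SQL; the helper name pick_one_per_team and its parameters are hypothetical.

```python
import random
from collections import defaultdict

# Sample data from the question, rebuilt compactly: (question, answer, team).
TEAMS = "ABBDACFCGD"
ROWS = [
    (f"question {i}", f"answer {min(i, 3)}", f"team {t}")
    for i, t in enumerate(TEAMS, start=1)
]


def pick_one_per_team(rows, limit, rng=random):
    """Pick up to `limit` rows at random so that no team appears twice."""
    by_team = defaultdict(list)
    for row in rows:
        by_team[row[2]].append(row)

    # Step 1: one random question for every team.
    one_each = [rng.choice(candidates) for candidates in by_team.values()]

    # Step 2: a random subset of those teams, not the first ones in sort order.
    return rng.sample(one_each, min(limit, len(one_each)))


for row in pick_one_per_team(ROWS, 5):
    print(row)
```

Because step 2 samples from all teams, every team (including team G, the last one alphabetically) has a chance of being returned.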

      【Discussion】:
