【Question title】: Find the longest streak of perfect scores per player
【Posted】: 2019-10-28 05:09:30
【Question】:

In my PostgreSQL database, a SELECT query with ORDER BY player_id ASC, time ASC returns the following result:

player_id  points  time

395        0       2018-06-01 17:55:23.982413-04
395        100     2018-06-30 11:05:21.8679-04
395        0       2018-07-15 21:56:25.420837-04
395        100     2018-07-28 19:47:13.84652-04
395        0       2018-11-27 17:09:59.384-05
395        100     2018-12-02 08:56:06.83033-05
399        0       2018-05-15 15:28:22.782945-04
399        100     2018-06-10 12:11:18.041521-04
454        0       2018-07-10 18:53:24.236363-04
675        0       2018-08-07 20:59:15.510936-04
696        0       2018-08-07 19:09:07.126876-04
756        100     2018-08-15 08:21:11.300871-04
756        100     2018-08-15 16:43:08.698862-04
756        0       2018-08-15 17:22:49.755721-04
756        100     2018-10-07 15:30:49.27374-04
756        0       2018-10-07 15:35:00.975252-04
756        0       2018-11-27 19:04:06.456982-05
756        100     2018-12-02 19:24:20.880022-05
756        100     2018-12-04 19:57:48.961111-05

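I am trying to find each player's longest streak where points = 100, with ties broken in favor of the streak that began most recently. I also need the time at which that longest streak began. The expected result is: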

player_id  longest_streak  time_began

395        1               2018-12-02 08:56:06.83033-05
399        1               2018-06-10 12:11:18.041521-04
756        2               2018-12-02 19:24:20.880022-05

【Comments】:

Tags: sql postgresql greatest-n-per-group window-functions gaps-and-islands


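【Solution 1】:

This is a gaps-and-islands problem. You can use a conditional SUM as a window function to compute a group number for each streak.

Then apply the MAX and COUNT window functions on top of that.

Query 1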

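WITH CTE AS (
    SELECT *,
           -- running count of 100-point rows minus the row number, both per player:
           -- the difference stays constant within a streak of consecutive 100s
           SUM(CASE WHEN points = 100 THEN 1 END) OVER(PARTITION BY player_id ORDER BY time) - 
           SUM(1) OVER(PARTITION BY player_id ORDER BY time) RN
    FROM T
)
SELECT player_id,
       MAX(longest_streak) longest_streak,  -- latest time among the player's 100-point rows
       MAX(cnt) longest_streak              -- length of the player's longest streak
FROM (
  SELECT player_id,
         MAX(time) OVER(PARTITION BY rn,player_id) longest_streak, 
         COUNT(*) OVER(PARTITION BY rn,player_id)  cnt
  FROM CTE 
  WHERE points = 100
) t1
GROUP BY player_id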

Results

| player_id |              longest_streak | longest_streak |
|-----------|-----------------------------|----------------|
|       756 | 2018-12-04T19:57:48.961111Z |              2 |
|       399 | 2018-06-10T12:11:18.041521Z |              1 |
|       395 |  2018-12-02T08:56:06.83033Z |              1 |
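
To make the grouping key concrete, here is a minimal Python sketch of the idea behind RN: for every 100-point row, subtract the row number from the running count of 100s. It is only an illustration, not part of the answer; the function name `island_keys` is hypothetical and the sample list is player 756's points from the question, in time order.

```python
from collections import Counter

# Player 756's points from the question, ordered by time.
points = [100, 100, 0, 100, 0, 0, 100, 100]

def island_keys(points):
    """Return (row_number, key) for each 100-point row, where
    key = running count of 100s - row number (hypothetical helper)."""
    keys = []
    running = 0
    for row_number, p in enumerate(points, start=1):
        if p == 100:
            running += 1
            keys.append((row_number, running - row_number))
    return keys

keys = island_keys(points)
# Rows that share a key belong to the same streak.
streak_lengths = Counter(key for _, key in keys)
print(keys)
print(max(streak_lengths.values()))
```

Rows 1 and 2 share one key, row 4 has its own, and rows 7 and 8 share another, so counting rows per key gives the streak lengths.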

【Comments】:

    【Solution 2】:

    One way to do this is to look at the number of rows between the previous and the next non-100 result. To get the length of each streak:

    with s as (
          select s.*,
                 row_number() over (partition by player_id order by time) as seqnum,
                 count(*) over (partition by player_id) as cnt
          from scores s
         )
    select s.*,
           coalesce(next_seqnum, cnt + 1) - coalesce(prev_seqnum, 0) - 1 as length
    from (select s.*,
                 -- seqnum of the closest earlier row that is not a 100
                 max(seqnum) filter (where points <> 100) over (partition by player_id order by time) as prev_seqnum,
                 -- seqnum of the closest later row that is not a 100
                 min(seqnum) filter (where points <> 100) over (partition by player_id order by time desc) as next_seqnum
          from s
         ) s
    where points = 100;
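
The boundary arithmetic can be checked outside the database. The following Python sketch is an illustration under stated assumptions rather than the answer's own code: `streak_lengths` is a hypothetical helper, positions are 1-based like seqnum, and 0 and n + 1 stand in for "no earlier" and "no later" non-100 row, mirroring the coalesce defaults.

```python
def streak_lengths(points):
    """For each 100-point row, return the length of the streak containing it,
    computed as next_non100 - prev_non100 - 1; other rows get None."""
    n = len(points)

    # Position of the closest earlier non-100 row (0 if there is none).
    prev_non100 = [0] * n
    last = 0
    for i, p in enumerate(points):
        if p != 100:
            last = i + 1
        prev_non100[i] = last

    # Position of the closest later non-100 row (n + 1 if there is none).
    next_non100 = [n + 1] * n
    nxt = n + 1
    for i in range(n - 1, -1, -1):
        if points[i] != 100:
            nxt = i + 1
        next_non100[i] = nxt

    return [
        next_non100[i] - prev_non100[i] - 1 if points[i] == 100 else None
        for i in range(n)
    ]

# Player 756's points from the question, ordered by time.
print(streak_lengths([100, 100, 0, 100, 0, 0, 100, 100]))
```

Every row of a streak reports the same length, which is why a later step still has to pick one row per streak.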
    

    You can then incorporate the other conditions:

    with s as (
          select s.*,
                 row_number() over (partition by player_id order by time) as seqnum,
                 count(*) over (partition by player_id) as cnt
          from scores s
         ),
         streaks as (
          select s.*,
                 next_seqnum - prev_seqnum - 1 as length,
                 max(next_seqnum - prev_seqnum - 1) over (partition by player_id) as max_length
          from (select s.*,
                       coalesce(max(seqnum) filter (where points <> 100) over (partition by player_id order by time), 0) as prev_seqnum,
                       coalesce(min(seqnum) filter (where points <> 100) over (partition by player_id order by time desc), cnt + 1) as next_seqnum
                from s
               ) s
          where points = 100
         )
    -- keep the first row of each longest streak, then the most recent one per player
    select distinct on (player_id)
           player_id, length as longest_streak, s.time as time_began
    from streaks s
    where length = max_length and
          seqnum = prev_seqnum + 1
    order by player_id, s.time desc;
    

    【Comments】:

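      【Solution 3】:

      Indeed a gaps-and-islands problem.

      Assuming:

      • "Streaks" are not interrupted by rows from other players.
      • All columns are defined NOT NULL. (Otherwise you have to do more.)

      This should be simplest and fastest, as it only needs two fast row_number() window functions: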

      SELECT DISTINCT ON (player_id)
             player_id, count(*) AS seq_len, min(ts) AS time_began
      FROM  (
         SELECT player_id, points, ts
              , row_number() OVER (PARTITION BY player_id ORDER BY ts) 
              - row_number() OVER (PARTITION BY player_id, points ORDER BY ts) AS grp
         FROM   tbl
         ) sub
      WHERE  points = 100
      GROUP  BY player_id, grp  -- omit "points" after WHERE points = 100
      ORDER  BY player_id, seq_len DESC, time_began DESC;
      

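      db<>fiddle here

      This uses the column name ts instead of time, which is a reserved word in standard SQL. It is allowed in Postgres, with some restrictions, but it is still a bad idea to use it as an identifier.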

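      The "trick" is to subtract row numbers so that consecutive rows with the same (player_id, points) fall into the same group (grp). Then filter the rows with 100 points, aggregate per group, and return only the longest, most recent result per player.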
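
As a rough illustration of the subtraction trick, the Python sketch below recomputes grp with two counters and then keeps the longest, most recent streak per player. It is a sketch under assumptions, not the answer's code: `longest_streaks` is a hypothetical name, the rows are players 756 and 399 from the question, and timestamps are abbreviated to minutes and compared as plain strings.

```python
from collections import defaultdict

# (player_id, points, ts) rows from the question, timestamps abbreviated.
rows = [
    (756, 100, "2018-08-15 08:21"),
    (756, 100, "2018-08-15 16:43"),
    (756,   0, "2018-08-15 17:22"),
    (756, 100, "2018-10-07 15:30"),
    (756,   0, "2018-10-07 15:35"),
    (756,   0, "2018-11-27 19:04"),
    (756, 100, "2018-12-02 19:24"),
    (756, 100, "2018-12-04 19:57"),
    (399,   0, "2018-05-15 15:28"),
    (399, 100, "2018-06-10 12:11"),
]

def longest_streaks(rows):
    """Return {player_id: (seq_len, time_began)} for the longest,
    most recently started streak of 100-point rows."""
    per_player = defaultdict(int)         # row_number() per player_id
    per_player_points = defaultdict(int)  # row_number() per (player_id, points)
    groups = defaultdict(list)            # (player_id, grp) -> ts of 100-point rows
    for player_id, points, ts in sorted(rows, key=lambda r: (r[0], r[2])):
        per_player[player_id] += 1
        per_player_points[(player_id, points)] += 1
        grp = per_player[player_id] - per_player_points[(player_id, points)]
        if points == 100:                 # WHERE points = 100
            groups[(player_id, grp)].append(ts)

    best = {}
    for (player_id, _), stamps in groups.items():
        candidate = (len(stamps), min(stamps))  # (seq_len, time_began)
        # Tuple comparison prefers the longer streak, then the later start.
        if player_id not in best or candidate > best[player_id]:
            best[player_id] = candidate
    return best

print(longest_streaks(rows))
```

For player 756 the two streaks of length 2 tie, and the later start wins, matching the tiebreaker in the question.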

      We can use GROUP BY and DISTINCT ON in the same SELECT because GROUP BY is applied before DISTINCT ON, following the sequence of events in a SELECT query.
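
The evaluation order can be mimicked in a few lines of Python: start from rows that have already been aggregated per group, sort them the way ORDER BY does, and keep the first row per player. This is only a sketch; `distinct_on_player` is a hypothetical helper, and the input tuples are the per-streak aggregates derived from the question's data for players 756 and 399, with abbreviated timestamps.

```python
# (player_id, seq_len, time_began): what the GROUP BY step would produce.
grouped = [
    (756, 2, "2018-08-15 08:21"),
    (756, 1, "2018-10-07 15:30"),
    (756, 2, "2018-12-02 19:24"),
    (399, 1, "2018-06-10 12:11"),
]

def distinct_on_player(grouped):
    """Emulate DISTINCT ON (player_id) with
    ORDER BY player_id, seq_len DESC, time_began DESC."""
    # Python's sort is stable, so sort by the least significant key first.
    ordered = sorted(grouped, key=lambda r: r[2], reverse=True)
    ordered = sorted(ordered, key=lambda r: r[1], reverse=True)
    ordered = sorted(ordered, key=lambda r: r[0])

    result = []
    seen = set()
    for row in ordered:
        if row[0] not in seen:  # keep only the first row per player
            seen.add(row[0])
            result.append(row)
    return result

print(distinct_on_player(grouped))
```

Because the deduplication runs on already aggregated rows, each player ends up with exactly one streak, the one that sorts first.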


      【Comments】:

        【Solution 4】:

        Here is my answer:

        select
            player_id,
            non_streak,
            streak,
            coalesce(non_streak, streak) as strk,
            max(t.time) as time
        from (
            select
                player_id,
                time,
                points,
                lag(points) over (partition by player_id order by time) as prev_point,
                -- assuming points is only ever 0 or 100:
                -- exactly one of the current and previous rows scored 100
                case when points + lag(points) over (partition by player_id order by time) = 100 then 1 end as non_streak,
                -- both the current and the previous row scored 100
                case when points + lag(points) over (partition by player_id order by time) > 100 then 1 end as streak
            from players
        ) t
        where coalesce(non_streak, streak) is not null
        group by 1, 2, 3
        order by 1, 2;
        

        【Comments】:
