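【Question title】: T-SQL: Selecting every instance of min and max values
【Posted】: 2021-04-25 02:26:07
【Question】:

I have a table called #TimeAtHome. It contains an address, a date, and a flag atHome that indicates whether the person was at home on that day. For each address, I need to capture the min and max date of every grouping (every run of consecutive days) in which the person was not at home (atHome = 0).

Here is some sample code: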

create table #TimeAtHome (
    [address] varchar(100),
    [date] date,
    [atHome] bit
)

insert into #TimeAtHome
values ('123 ABC Street', '2020-01-01', '1'),
       ('123 ABC Street', '2020-01-02', '1'),
       ('123 ABC Street', '2020-01-03', '0'),
       ('123 ABC Street', '2020-01-04', '0'),
       ('123 ABC Street', '2020-01-05', '0'),
       ('123 ABC Street', '2020-01-06', '0'),
       ('123 ABC Street', '2020-01-07', '1'),
       ('123 ABC Street', '2020-01-08', '0'),
       ('123 ABC Street', '2020-01-09', '0'),
       ('123 ABC Street', '2020-01-10', '1'),
       ('777 Hello Ct', '2020-01-01', '1'),
       ('777 Hello Ct', '2020-01-02', '1'),
       ('777 Hello Ct', '2020-01-03', '1'),
       ('777 Hello Ct', '2020-01-04', '0'),
       ('777 Hello Ct', '2020-01-05', '1'),
       ('777 Hello Ct', '2020-01-06', '1')

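Here is the result I want:

【Comments】:

    Tags: tsql max ssms min


    【Solution 1】:

    I think I have a simpler solution, since this looks like a gaps-and-islands problem. I use the LAG() function to compare each row's atHome flag with the previous row's flag, which marks where each island starts and ends. Then I use a running SUM() over those change markers to give every island a group number, and aggregate the dates from there: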

    SELECT Address,Min(Date) minDate, Max(date) maxDate
    FROM
    (
        SELECT *, SUM(CASE WHEN AtHome <> PrevAtHome THEN 1 ELSE 0 END) OVER(PARTITION BY Address order by date) Grp
        FROM(
          SELECT *, LAG(ATHome,1,AtHome) OVER(PARTITION BY address order by date) PrevAtHome
          from #TimeAtHome
          ) T
    ) Final
    WHERE Athome = 0
    GROUP BY Address,Grp
    ORDER BY Address, minDate
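
To make the grouping step easier to follow, here is a minimal sketch that only exposes the intermediate columns of the query above. It assumes SQL Server 2012 or later (for LAG()) and the sample #TimeAtHome table from the question. The explicit ROWS frame is my addition and is not part of the original answer.

```sql
-- Illustrative only: show PrevAtHome and the running group number per row.
SELECT address,
       [date],
       atHome,
       PrevAtHome,
       -- Running count of flag changes; every change starts a new group.
       SUM(CASE WHEN atHome <> PrevAtHome THEN 1 ELSE 0 END)
           OVER (PARTITION BY address
                 ORDER BY [date]
                 ROWS UNBOUNDED PRECEDING) AS Grp
FROM (
    SELECT *,
           -- The default (third argument) makes the first row equal to itself,
           -- so the first row never counts as a change.
           LAG(atHome, 1, atHome)
               OVER (PARTITION BY address ORDER BY [date]) AS PrevAtHome
    FROM #TimeAtHome
) AS t
ORDER BY address, [date];
-- For '123 ABC Street' the Grp column reads 0,0,1,1,1,1,2,3,3,4 in date order,
-- so the away rows (atHome = 0) fall into groups 1 and 3.
```

Filtering on atHome = 0 and grouping by address and Grp then yields one row per away period.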
    

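    【Comments】:

      【Solution 2】:

      We can try the following approach:

      1. Get all minDate values by joining the table with itself and checking that the person is at home on one date and not at home on the next date (this is subquery 1).
      2. Get all maxDate values the same way as in point 1, except that we check that the person is not at home on one date and is back home on the next date (this is subquery 2).
      3. For each address, match the first minDate with the first maxDate, the second minDate with the second maxDate, and so on (join subqueries 1 and 2).
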
      SELECT q1.address,
          q1.minDate,
          q2.maxDate
      FROM (
              SELECT ROW_NUMBER() OVER(
                      PARTITION BY t2.address
                      ORDER BY t2.date
                  ) as row,
                  t2.address,
                  t2.date as minDate
              FROM #TimeAtHome t1 inner join #TimeAtHome t2 ON t1.address = t2.address and t1.date = DATEADD(DAY, -1, t2.date)
              WHERE t1.atHome = 1
                  AND t2.atHome = 0
          ) q1
          INNER JOIN (
              SELECT ROW_NUMBER() OVER(
                      PARTITION BY t1.address
                      ORDER BY t1.date
                  ) as row,
                  t1.address,
                  t1.date as maxDate
              FROM #TimeAtHome t1 INNER JOIN #TimeAtHome t2 ON t1.address = t2.address and t1.date = DATEADD(DAY, -1, t2.date)
              WHERE t1.atHome = 0
                  AND t2.atHome = 1
          ) q2 ON q1.address = q2.address
          AND q1.row = q2.row
      

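      Note the limitations of this query:

      1. The dates in the table must be consecutive, because we find the next record simply by subtracting one day: t1.date = DATEADD(DAY, -1, t2.date)
      2. The person must start out at home, so that the first minDate (when he leaves) matches the first maxDate (just before he comes back).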
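
Limitation 1 can be checked before relying on the self-join. The following is a minimal sketch, assuming SQL Server 2012 or later (for LEAD()) and the #TimeAtHome table from the question. The alias nextDate is a name chosen here for illustration.

```sql
-- Illustrative check: list every place where the next recorded date
-- for an address is more than one day after the current one.
SELECT address, [date], nextDate
FROM (
    SELECT address,
           [date],
           LEAD([date]) OVER (PARTITION BY address ORDER BY [date]) AS nextDate
    FROM #TimeAtHome
) AS t
WHERE DATEDIFF(DAY, [date], nextDate) > 1;
-- The sample data has no missing days, so this returns no rows.
```

If this query returns rows, the DATEADD-based join above will not pair the rows on either side of the missing dates.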

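      【Comments】:

        【Solution 3】:

        The query looks like this. cte1 builds a complete view of the data (each row together with the next row's date and atHome flag) for use in the later steps. CTE2 finds the minDate, CTE3 finds the maxDate, and a ranking function (ROW_NUMBER) is used for the final join.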

        ;WITH cte1
        AS
        (
            SELECT *, 
                LEAD(date) OVER (PARTITION BY address ORDER BY date) AS nextDate, 
                LEAD(atHome) OVER (PARTITION BY address ORDER  BY date) AS NextAtHome 
            FROM #TimeAtHome
            --ORDER BY address, date
        ),
        CTE2 AS
        (
            SELECT 
                address, 
                cte1.nextDate AS minDate,
                ROW_NUMBER() OVER (ORDER BY cte1.address , cte1.date) AS R1
            FROM cte1 
            WHERE cte1.atHome = 1 AND cte1.NextAtHome = 0
        ),
        CTE3 AS
        (
            SELECT 
                address, 
                date AS maxDate,
                ROW_NUMBER() OVER (ORDER BY cte1.address, cte1.date) AS R2
            FROM cte1 
            WHERE cte1.atHome = 0 AND cte1.NextAtHome = 1
        )
        SELECT CTE2.address,CTE2.minDate,CTE3.maxDate
        FROM cte2
        INNER JOIN cte3 ON cte2.R1 = Cte3.R2
        

        【Comments】:

          【Solution 4】:

          Another possibility:

          SELECT dt.address,
                 MIN(dt.Dt) AS minDate,
                 MAX(dt.Dt) AS maxDate
          FROM (
                  SELECT address, 
                         t.Date AS Dt,
                         DATEDIFF(D, ROW_NUMBER() OVER(partition by t.address ORDER BY t.Date),
                                  t.Date) AS DtRange
                  FROM #TimeAtHome t
                  WHERE t.atHome = 0
              ) AS dt
          GROUP BY dt.address, dt.DtRange
          ORDER BY address, minDate;
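
The grouping key in this answer works because, within a run of consecutive away days, the date and the row number both advance by one, so their difference stays constant. The sketch below shows that idea with a swapped-in variant: it uses DATEADD instead of the DATEDIFF expression above, so the key is a readable date rather than an integer. The column names rn and islandKey are hypothetical and chosen here for illustration.

```sql
-- Illustrative only: date minus row number is constant within each island.
SELECT t.address,
       t.[date],
       ROW_NUMBER() OVER (PARTITION BY t.address ORDER BY t.[date]) AS rn,
       DATEADD(DAY,
               -CAST(ROW_NUMBER() OVER (PARTITION BY t.address
                                        ORDER BY t.[date]) AS int),
               t.[date]) AS islandKey
FROM #TimeAtHome AS t
WHERE t.atHome = 0
ORDER BY t.address, t.[date];
-- '123 ABC Street': 2020-01-03..2020-01-06 all get islandKey 2020-01-02,
--                   2020-01-08..2020-01-09 both get islandKey 2020-01-03.
-- '777 Hello Ct':   2020-01-04 gets islandKey 2020-01-03.
```

Grouping by address and this key, then taking MIN and MAX of the date, gives one row per away period. Like the original, it treats a missing date as the end of an island.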
          

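          【Comments】:

            【Solution 5】:

            Here is another way. Note that it returns only the first and the last not-at-home row for each address (the overall min and max), not one row per away period: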

            SELECT
                *
            FROM 
            (
                SELECT
                    *
                    ,ROW_NUMBER() OVER(PARTITION BY address order by date) PrevAtHome_A
                    ,ROW_NUMBER() OVER(PARTITION BY address order by date DESC) PrevAtHome_D
                from #TimeAtHome
                WHERE AtHome = 0
            )A
            WHERE PrevAtHome_A =1 OR PrevAtHome_D =1
            ORDER BY [address], [date]
            

            【Comments】:
