【Question title】: Split dataset into group based on date+count [MSSQL 2012]
【Posted】: 2016-11-02 16:27:43
【Question description】:

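I have a dataset in SQL Server 2012 that needs to follow these rules:

A "seat" allows 8 results within 6 months of its first result. If there are more than 8 results within 6 months of the date of the first result, a new "seat" is given.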

If a result is created 6 or more months after the date of the first result, a new "seat" is given.

So I have the following data:

User    DateCreated
----    -------------
User1   2015-01-01 16:05:00
User1   2015-01-02 16:05:00
User1   2015-01-03 16:05:00
User1   2015-01-04 16:05:00
User1   2015-01-05 16:05:00
User1   2015-01-06 16:05:00
User1   2015-01-07 16:05:00
User1   2015-01-08 16:05:00
User1   2015-01-09 16:05:00
User1   2015-01-10 13:25:00
User1   2015-01-11 13:25:00
User1   2015-01-12 13:25:00
User1   2015-09-01 13:00:00 
User1   2016-04-01 13:00:00
User2   2015-01-01 13:25:00 
User2   2015-01-02 13:25:00 
User2   2015-09-01 13:25:00 
User2   2016-01-01 13:25:00 
User2   2016-05-01 13:25:00 
User3   2015-01-01 16:05:00 
User3   2015-01-02 16:05:00     
User3   2015-01-03 16:05:00     

Based on the rules above, the rows can be split into the following "groups":

User    DateCreated             Group
----    -------------           -----
User1   2015-01-01 16:05:00     1
User1   2015-01-02 16:05:00     1
User1   2015-01-03 16:05:00     1
User1   2015-01-04 16:05:00     1
User1   2015-01-05 16:05:00     1
User1   2015-01-06 16:05:00     1
User1   2015-01-07 16:05:00     1
User1   2015-01-08 16:05:00     1 /*8 results within 6 months from first row*/

User1   2015-01-09 16:05:00     2
User1   2015-01-10 13:25:00     2
User1   2015-01-11 13:25:00     2
User1   2015-01-12 13:25:00     2

User1   2015-09-01 13:00:00     3 /*created 6+ months after previous row*/

User1   2016-04-01 13:00:00     4 /*created 6+ months after previous row*/
----
User2   2015-01-01 13:25:00     1
User2   2015-01-02 13:25:00     1

User2   2015-09-01 13:25:00     2 /*created 6+ months after previous row*/
User2   2016-01-01 13:25:00     2 

User2   2016-05-01 13:25:00     3 /*created 6+ months after previous row*/
----
User3   2015-01-01 16:05:00     1
User3   2015-01-02 16:05:00     1
User3   2015-01-03 16:05:00     1 /*3 results within 6 months from first row, within the 8 result cut-off */

The final result might then look like this:

User        Seats
----        -----
User1       4
User2       3
User3       1

How, if at all, can this be achieved in a SQL query?

--

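OK, so the answer at https://stackoverflow.com/a/40385655/1283391 gets halfway there. I have amended the expected output above to explain the difference, because my original explanation was incorrect.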

I don't think I need a SUM here, but rather some kind of total.

【Comments】:

    Tags: sql-server sql-server-2012 grouping


    【Solution 1】:

    Here is one way to do it:

    ;WITH cte
         AS (SELECT *,
             ( ( Row_number()OVER(partition BY [User] 
                 ORDER BY [DateCreated]) - 1 ) / 8 ) + 1 AS rn, -- To group 8 records per user
                    Lag([DateCreated])OVER(partition BY [User] ORDER BY [DateCreated])                     AS PREV_DATE
             FROM   Yourtable),
         INTR
         AS (SELECT *,
                    Sum(Datediff(mm, Isnull(PREV_DATE, [DateCreated]), [DateCreated]))
                      OVER(
                        partition BY [User]
                        ORDER BY [DateCreated]) AS GRP -- To group the user based on Date difference 
             FROM   cte)
    SELECT [User],[DateCreated],
           Dense_rank()
             OVER(
               PARTITION BY [User]
               ORDER BY rn, GRP) AS Groups
    FROM   INTR 
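
One detail worth checking when adapting this query: DATEDIFF(mm, ...) counts the calendar-month boundaries crossed between two dates, not elapsed months, so two rows one day apart can differ by 1. The standalone sketch below contrasts it with a DATEADD comparison; the variable names and literal dates are illustrative and do not come from the question.

```sql
-- Illustrative dates only: one day apart, but in different calendar months
DECLARE @prev datetime = '2015-01-31T16:05:00',
        @curr datetime = '2015-02-01T16:05:00';

SELECT DATEDIFF(mm, @prev, @curr)             AS MonthBoundariesCrossed, -- 1, although only one day has elapsed
       CASE WHEN @curr >= DATEADD(MONTH, 6, @prev)
            THEN 1 ELSE 0 END                 AS SixMonthsElapsed;       -- 0
```

Because GRP is a running sum of these values, it changes whenever consecutive rows fall in different calendar months, so such rows land in separate groups even when they are less than 6 months apart.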
    

    Sample data

    CREATE TABLE Yourtable
        ([User] varchar(5), [DateCreated] datetime)
    ;
    
    INSERT INTO Yourtable
        ([User], [DateCreated])
    VALUES
        ('User1', '2015-01-01 16:05:00'),
        ('User1', '2015-01-02 16:05:00'),
        ('User1', '2015-01-03 16:05:00'),
        ('User1', '2015-01-04 16:05:00'),
        ('User1', '2015-01-05 16:05:00'),
        ('User1', '2015-01-06 16:05:00'),
        ('User1', '2015-01-07 16:05:00'),
        ('User1', '2015-01-08 16:05:00'),
        ('User1', '2015-01-09 16:05:00'),
        ('User1', '2015-01-10 13:25:00'),
        ('User1', '2015-01-11 13:25:00'),
        ('User1', '2015-01-12 13:25:00'),
        ('User2', '2015-01-01 13:25:00'),
        ('User2', '2015-01-02 13:25:00'),
        ('User2', '2015-09-01 13:25:00'),
        ('User2', '2016-05-01 13:25:00')
    ;
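
The sample above covers only part of the question's dataset. To compare a query against the question's full expected output, the remaining rows can be loaded as well. This is a convenience sketch that reuses the same Yourtable definition; the values are copied from the question.

```sql
-- Rows from the question's dataset that the sample above leaves out
INSERT INTO Yourtable
    ([User], [DateCreated])
VALUES
    ('User1', '2015-09-01 13:00:00'),
    ('User1', '2016-04-01 13:00:00'),
    ('User2', '2016-01-01 13:25:00'),
    ('User3', '2015-01-01 16:05:00'),
    ('User3', '2015-01-02 16:05:00'),
    ('User3', '2015-01-03 16:05:00')
;
```

With these rows present, the User2 row dated 2016-01-01 is the case raised in the comments.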
    

    Result

    +-------+-------------------------+--------+
    | User  |       DateCreated       | Groups |
    +-------+-------------------------+--------+
    | User1 | 2015-01-01 16:05:00.000 |      1 |
    | User1 | 2015-01-02 16:05:00.000 |      1 |
    | User1 | 2015-01-03 16:05:00.000 |      1 |
    | User1 | 2015-01-04 16:05:00.000 |      1 |
    | User1 | 2015-01-05 16:05:00.000 |      1 |
    | User1 | 2015-01-06 16:05:00.000 |      1 |
    | User1 | 2015-01-07 16:05:00.000 |      1 |
    | User1 | 2015-01-08 16:05:00.000 |      1 |
    | User1 | 2015-01-09 16:05:00.000 |      2 |
    | User1 | 2015-01-10 13:25:00.000 |      2 |
    | User1 | 2015-01-11 13:25:00.000 |      2 |
    | User1 | 2015-01-12 13:25:00.000 |      2 |
    | User2 | 2015-01-01 13:25:00.000 |      1 |
    | User2 | 2015-01-02 13:25:00.000 |      1 |
    | User2 | 2015-09-01 13:25:00.000 |      2 |
    | User2 | 2016-05-01 13:25:00.000 |      3 |
    +-------+-------------------------+--------+
    

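    To get the final result

    -- Keep the WITH cte ..., INTR ... definitions and replace the final SELECT with this,
    -- because CTEs last for one statement only and INTR has no Groups column:
    SELECT [User], Max(Groups) AS Seats
    FROM   (SELECT [User], Dense_rank() OVER(PARTITION BY [User] ORDER BY rn, GRP) AS Groups
            FROM   INTR) AS g
    GROUP  BY [User]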
    

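    【Comments】:

    • This is really impressive and exactly what I need. If I wanted to filter by date, I could do that right after FROM Yourtable, correct?
    • Sorry, but I'm not sure the date-difference grouping works. If you change User2 - 2016-05-01 13:25:00.000 to User2 - 2016-01-01 13:25:00.000, it should be in group 2 because it falls within 6 months. At the moment it shows as 3. Does that make sense? If the current row's date is 6 months later than the previous row's date, it is a new group.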
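
Following up on the comment above, one way to anchor each seat to its own first row is a recursive CTE that walks every user's rows in date order. This is a minimal sketch, not a tested solution. It assumes the Yourtable sample table from the answer and treats a row exactly 6 months after the seat's first row as starting a new seat. The names Ordered, Walk, SeatStart and InSeat are made up for illustration.

```sql
;WITH Ordered AS (
    -- Number each user's rows in date order so the recursion can step row by row
    SELECT [User], [DateCreated],
           ROW_NUMBER() OVER (PARTITION BY [User] ORDER BY [DateCreated]) AS rn
    FROM   Yourtable
),
Walk AS (
    -- Anchor: each user's first row opens seat 1
    SELECT [User], [DateCreated], rn,
           [DateCreated] AS SeatStart,  -- first result of the current seat
           1             AS InSeat,     -- results used so far in the current seat
           1             AS Seat
    FROM   Ordered
    WHERE  rn = 1
    UNION ALL
    -- Step: open a new seat when the current one already holds 8 results
    -- or the row falls 6 or more months after the seat's first result
    SELECT o.[User], o.[DateCreated], o.rn,
           CASE WHEN w.InSeat >= 8 OR o.[DateCreated] >= DATEADD(MONTH, 6, w.SeatStart)
                THEN o.[DateCreated] ELSE w.SeatStart END,
           CASE WHEN w.InSeat >= 8 OR o.[DateCreated] >= DATEADD(MONTH, 6, w.SeatStart)
                THEN 1 ELSE w.InSeat + 1 END,
           CASE WHEN w.InSeat >= 8 OR o.[DateCreated] >= DATEADD(MONTH, 6, w.SeatStart)
                THEN w.Seat + 1 ELSE w.Seat END
    FROM   Walk AS w
           JOIN Ordered AS o
             ON o.[User] = w.[User] AND o.rn = w.rn + 1
)
SELECT [User], MAX(Seat) AS Seats
FROM   Walk
GROUP  BY [User]
OPTION (MAXRECURSION 0); -- lift the default limit of 100 recursion levels
```

Tracing the rules by hand over the question's full dataset gives 4, 3 and 1 seats for User1, User2 and User3, so that is the result to compare against. Selecting [User], [DateCreated] and Seat from Walk instead lists the per-row groups. On large tables it may be worth materializing Ordered into an indexed temp table first, since the recursive step joins to it once per level.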