【Question title】: How to get every 5 minute interval record?
【Posted】: 2019-01-02 06:51:05
【Question】:

My current data:

sr_no   PROJECT_ID  PHASE   VACUUM      HUMIDITY    TEMPERATURE     CR_DATE
1       3QA12352    0       3.0000      45.0000     55.0000         2018-12-18 09:38:26.477
2       3QA12352    0       3.0000      45.0000     55.0000         2018-12-18 09:39:26.430
3       3QA12352    0       3.0000      45.0000     55.0000         2018-12-18 09:40:26.447
4       3QA12352    0       3.0000      45.0000     55.0000         2018-12-18 09:41:26.437
5       3QA12352    0       3.0000      45.0000     55.0000         2018-12-18 09:42:33.280
6       3QA12352    0       3.0000      45.0000     55.0000         2018-12-18 09:43:33.267
7       3QA12352    0       3.0000      45.0000     55.0000         2018-12-18 09:44:33.157
8       3QA12352    0       3.0000      45.0000     55.0000         2018-12-18 09:45:33.320
9       3QA12352    0       3.0000      45.0000     55.0000         2018-12-18 09:46:33.293
10      3QA12352    0       3.0000      45.0000     55.0000         2018-12-18 09:47:33.290
11      3QA12352    0       3.0000      45.0000     55.0000         2018-12-18 09:48:33.330
12      3QA12352    0       3.0000      45.0000     55.0000         2018-12-18 09:49:33.350
13      3ewd        0       56.0000     6.0000      12.0000         2018-12-18 16:00:17.883
14      3ewd        2       56.0000     6.0000      12.0000         2018-12-18 16:01:17.757
15      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:02:17.760
16      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:03:17.793
17      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:04:18.123
18      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:05:17.843
19      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:06:17.767
20      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:07:17.887
21      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:08:17.820
22      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:09:17.767
23      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:10:17.800
24      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:11:17.800
25      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:12:17.773
26      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:13:17.797
27      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:14:17.757
28      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:15:17.757
29      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:16:17.770
30      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:17:17.803

I want to get records at 5-minute intervals, like this:

sr_no   PROJECT_ID  PHASE   VACUUM      HUMIDITY    TEMPERATURE     CR_DATE
1       3QA12352    0       3.0000      45.0000     55.0000         2018-12-18 09:38:26.477
6       3QA12352    0       3.0000      45.0000     55.0000         2018-12-18 09:43:33.267
12      3QA12352    0       3.0000      45.0000     55.0000         2018-12-18 09:48:33.350
13      3ewd        0       56.0000     6.0000      12.0000         2018-12-18 16:00:17.883
18      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:05:17.843
24      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:10:17.800
25      3ewd        2       56.0000     60.0000     56.0000         2018-12-18 16:15:17.773

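How can I do that?

【Comments】:

  • What exactly does "get 5-minute interval records" mean? What do your input and expected result look like?
  • @SalmanA I have updated my question.
  • Hi Genish, why do the 9:43 and 9:44 records become the 5-minute records for project 3QA12352?
  • @sree Sorry, please check the updated output.
  • Your expected output has some discrepancies (for example, 2018-12-18 09:48:33.350 does not exist in the input).

Tags: sql sql-server tsql datetime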


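【Solution 1】:

You can use a recursive CTE to generate 5-minute datetime intervals for each project, then join your measures against those intervals and retrieve the minimum measure (by CR_DATE) in each group.

In the following example I assume that sr_no is a PRIMARY KEY (or UNIQUE) and is an INT IDENTITY or an always-increasing number. I also ignore all other columns of the table, since they don't matter for this particular problem (you can select whichever ones you need in the final SELECT).

Sample data: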

IF OBJECT_ID('tempdb..#Measures') IS NOT NULL
    DROP TABLE #Measures

CREATE TABLE #Measures (
    sr_no INT IDENTITY,
    PROJECT_ID VARCHAR(100),
    CR_DATE DATETIME)

INSERT INTO #Measures (
    PROJECT_ID,
    CR_DATE)
VALUES
    ('A', '2018-01-01 00:02:26.112'),
    ('A', '2018-01-01 00:03:26.112'),
    ('A', '2018-01-01 00:07:26.014'),
    ('A', '2018-01-01 00:12:26.112'),
    ('A', '2018-01-01 00:23:43.112'),
    ('A', '2018-01-01 00:26:26.112'),

    ('B', '2018-11-26 00:01:26.112'),
    ('B', '2018-11-25 23:59:00.000'),
    ('B', '2018-11-26 05:02:26.112')

Proposed solution:

DECLARE @IntervalMinutes INT = 5

;WITH MaxDateMeasuresByProject AS
(
    SELECT
        PROJECT_ID = M.PROJECT_ID,
        MaxCR_DATE = MAX(M.CR_DATE)
    FROM
        #Measures AS M
    GROUP BY
        M.PROJECT_ID
),
RecursiveIntervals AS
(
    -- Anchor (minimum CR_DATE by PROJECT_ID)
    SELECT
        PROJECT_ID = M.PROJECT_ID,
        IntervalStart = MIN(M.CR_DATE),
        IntervalEnd = DATEADD(MINUTE, @IntervalMinutes, MIN(M.CR_DATE)),
        RecursiveLevel = 1
    FROM
        #Measures AS M 
    GROUP BY
        M.PROJECT_ID

    UNION ALL

    -- Recursion (minutes added to each project interval, until max available measure)
    SELECT
        PROJECT_ID = R.PROJECT_ID,
        IntervalStart = R.IntervalEnd,
        IntervalEnd = DATEADD(MINUTE, @IntervalMinutes, R.IntervalEnd),
        RecursiveLevel = R.RecursiveLevel + 1
    FROM
        RecursiveIntervals AS R
        INNER JOIN MaxDateMeasuresByProject AS M ON R.PROJECT_ID = M.PROJECT_ID
    WHERE
        R.IntervalEnd <= M.MaxCR_DATE
),
FirstMeasureByIntervalByProject AS
(
    SELECT
        PROJECT_ID = M.PROJECT_ID,
        RecursiveLevel = R.RecursiveLevel,
        FirstMeasureSR_NO = MIN(M.sr_no)
    FROM
        #Measures AS M
        INNER JOIN RecursiveIntervals AS R ON 
            M.PROJECT_ID = R.PROJECT_ID AND
            M.CR_DATE >= R.IntervalStart AND
            M.CR_DATE < R.IntervalEnd
    GROUP BY
        M.PROJECT_ID,
        R.RecursiveLevel
)
SELECT
    M.*
FROM
    FirstMeasureByIntervalByProject AS F
    INNER JOIN #Measures AS M ON F.FirstMeasureSR_NO = M.sr_no
ORDER BY
    M.PROJECT_ID,
    M.CR_DATE
OPTION
    (MAXRECURSION 0)

The intermediate step RecursiveIntervals produces the following (5-minute intervals between each project's minimum and maximum measure):

PROJECT_ID  IntervalStart               IntervalEnd                 RecursiveLevel
A           2018-01-01 00:02:26.113     2018-01-01 00:07:26.113     1
A           2018-01-01 00:07:26.113     2018-01-01 00:12:26.113     2
A           2018-01-01 00:12:26.113     2018-01-01 00:17:26.113     3
A           2018-01-01 00:17:26.113     2018-01-01 00:22:26.113     4
A           2018-01-01 00:22:26.113     2018-01-01 00:27:26.113     5
B           2018-11-25 23:59:00.000     2018-11-26 00:04:00.000     1
B           2018-11-26 00:04:00.000     2018-11-26 00:09:00.000     2
B           2018-11-26 00:09:00.000     2018-11-26 00:14:00.000     3
B           2018-11-26 00:14:00.000     2018-11-26 00:19:00.000     4
B           2018-11-26 00:19:00.000     2018-11-26 00:24:00.000     5
B           2018-11-26 00:24:00.000     2018-11-26 00:29:00.000     6
B           2018-11-26 00:29:00.000     2018-11-26 00:34:00.000     7
B           2018-11-26 00:34:00.000     2018-11-26 00:39:00.000     8
B           2018-11-26 00:39:00.000     2018-11-26 00:44:00.000     9
B           2018-11-26 00:44:00.000     2018-11-26 00:49:00.000     10
B           2018-11-26 00:49:00.000     2018-11-26 00:54:00.000     11
B           2018-11-26 00:54:00.000     2018-11-26 00:59:00.000     12
B           2018-11-26 00:59:00.000     2018-11-26 01:04:00.000     13
B           2018-11-26 01:04:00.000     2018-11-26 01:09:00.000     14
B           2018-11-26 01:09:00.000     2018-11-26 01:14:00.000     15
B           2018-11-26 01:14:00.000     2018-11-26 01:19:00.000     16
B           2018-11-26 01:19:00.000     2018-11-26 01:24:00.000     17
B           2018-11-26 01:24:00.000     2018-11-26 01:29:00.000     18
B           2018-11-26 01:29:00.000     2018-11-26 01:34:00.000     19
B           2018-11-26 01:34:00.000     2018-11-26 01:39:00.000     20
B           2018-11-26 01:39:00.000     2018-11-26 01:44:00.000     21
B           2018-11-26 01:44:00.000     2018-11-26 01:49:00.000     22
B           2018-11-26 01:49:00.000     2018-11-26 01:54:00.000     23
B           2018-11-26 01:54:00.000     2018-11-26 01:59:00.000     24
B           2018-11-26 01:59:00.000     2018-11-26 02:04:00.000     25
B           2018-11-26 02:04:00.000     2018-11-26 02:09:00.000     26
B           2018-11-26 02:09:00.000     2018-11-26 02:14:00.000     27
B           2018-11-26 02:14:00.000     2018-11-26 02:19:00.000     28
B           2018-11-26 02:19:00.000     2018-11-26 02:24:00.000     29
B           2018-11-26 02:24:00.000     2018-11-26 02:29:00.000     30
B           2018-11-26 02:29:00.000     2018-11-26 02:34:00.000     31
B           2018-11-26 02:34:00.000     2018-11-26 02:39:00.000     32
B           2018-11-26 02:39:00.000     2018-11-26 02:44:00.000     33
B           2018-11-26 02:44:00.000     2018-11-26 02:49:00.000     34
B           2018-11-26 02:49:00.000     2018-11-26 02:54:00.000     35
B           2018-11-26 02:54:00.000     2018-11-26 02:59:00.000     36
B           2018-11-26 02:59:00.000     2018-11-26 03:04:00.000     37
B           2018-11-26 03:04:00.000     2018-11-26 03:09:00.000     38
B           2018-11-26 03:09:00.000     2018-11-26 03:14:00.000     39
B           2018-11-26 03:14:00.000     2018-11-26 03:19:00.000     40
B           2018-11-26 03:19:00.000     2018-11-26 03:24:00.000     41
B           2018-11-26 03:24:00.000     2018-11-26 03:29:00.000     42
B           2018-11-26 03:29:00.000     2018-11-26 03:34:00.000     43
B           2018-11-26 03:34:00.000     2018-11-26 03:39:00.000     44
B           2018-11-26 03:39:00.000     2018-11-26 03:44:00.000     45
B           2018-11-26 03:44:00.000     2018-11-26 03:49:00.000     46
B           2018-11-26 03:49:00.000     2018-11-26 03:54:00.000     47
B           2018-11-26 03:54:00.000     2018-11-26 03:59:00.000     48
B           2018-11-26 03:59:00.000     2018-11-26 04:04:00.000     49
B           2018-11-26 04:04:00.000     2018-11-26 04:09:00.000     50
B           2018-11-26 04:09:00.000     2018-11-26 04:14:00.000     51
B           2018-11-26 04:14:00.000     2018-11-26 04:19:00.000     52
B           2018-11-26 04:19:00.000     2018-11-26 04:24:00.000     53
B           2018-11-26 04:24:00.000     2018-11-26 04:29:00.000     54
B           2018-11-26 04:29:00.000     2018-11-26 04:34:00.000     55
B           2018-11-26 04:34:00.000     2018-11-26 04:39:00.000     56
B           2018-11-26 04:39:00.000     2018-11-26 04:44:00.000     57
B           2018-11-26 04:44:00.000     2018-11-26 04:49:00.000     58
B           2018-11-26 04:49:00.000     2018-11-26 04:54:00.000     59
B           2018-11-26 04:54:00.000     2018-11-26 04:59:00.000     60
B           2018-11-26 04:59:00.000     2018-11-26 05:04:00.000     61

Final result:

sr_no   PROJECT_ID  CR_DATE
1       A           2018-01-01 00:02:26.113
4       A           2018-01-01 00:12:26.113
5       A           2018-01-01 00:23:43.113
7       B           2018-11-26 00:01:26.113
9       B           2018-11-26 05:02:26.113
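
Note that in this sample, project B's row 8 (23:59:00) is earlier than row 7, so MIN(sr_no) returns row 7 rather than the earliest reading in the interval. If sr_no is not guaranteed to increase with CR_DATE, one possible variant ranks the rows by CR_DATE inside each interval. This is only a sketch: it assumes the #Measures sample above still exists in the session, and RankedMeasures and RowInInterval are names made up for illustration.

```sql
-- Sketch: pick the earliest row per interval by CR_DATE instead of MIN(sr_no).
-- Assumes #Measures from the data sample above already exists in this session.
DECLARE @IntervalMinutes INT = 5;

WITH MaxDateMeasuresByProject AS
(
    SELECT PROJECT_ID, MaxCR_DATE = MAX(CR_DATE)
    FROM #Measures
    GROUP BY PROJECT_ID
),
RecursiveIntervals AS
(
    -- Anchor: the first interval starts at each project's earliest CR_DATE
    SELECT
        PROJECT_ID,
        IntervalStart = MIN(CR_DATE),
        IntervalEnd = DATEADD(MINUTE, @IntervalMinutes, MIN(CR_DATE)),
        RecursiveLevel = 1
    FROM #Measures
    GROUP BY PROJECT_ID

    UNION ALL

    -- Recursion: keep adding intervals until the project's last measure is covered
    SELECT
        R.PROJECT_ID,
        R.IntervalEnd,
        DATEADD(MINUTE, @IntervalMinutes, R.IntervalEnd),
        R.RecursiveLevel + 1
    FROM RecursiveIntervals AS R
    INNER JOIN MaxDateMeasuresByProject AS M ON R.PROJECT_ID = M.PROJECT_ID
    WHERE R.IntervalEnd <= M.MaxCR_DATE
),
RankedMeasures AS
(
    -- Hypothetical CTE: number the rows of each interval by CR_DATE (sr_no breaks ties)
    SELECT
        M.sr_no,
        M.PROJECT_ID,
        M.CR_DATE,
        RowInInterval = ROW_NUMBER() OVER (
            PARTITION BY M.PROJECT_ID, R.RecursiveLevel
            ORDER BY M.CR_DATE, M.sr_no)
    FROM #Measures AS M
    INNER JOIN RecursiveIntervals AS R ON
        M.PROJECT_ID = R.PROJECT_ID AND
        M.CR_DATE >= R.IntervalStart AND
        M.CR_DATE < R.IntervalEnd
)
SELECT sr_no, PROJECT_ID, CR_DATE
FROM RankedMeasures
WHERE RowInInterval = 1
ORDER BY PROJECT_ID, CR_DATE
OPTION (MAXRECURSION 0);
```

With the sample data, the only difference from the result above should be that project B's first interval yields sr_no 8 instead of sr_no 7.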

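If you have many records and each project spans a long period, this query will likely take a long time. In that case, dumping the recursive CTE's output into a temporary table will speed things up.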
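
A rough sketch of that temporary-table idea follows. It assumes #Measures already exists in the session; the #Intervals table and the IX_Intervals index are hypothetical names, and whether the index actually helps depends on your data.

```sql
-- Sketch: materialize the intervals once, then join against the temp table.
-- Assumes #Measures already exists; #Intervals and IX_Intervals are made-up names.
DECLARE @IntervalMinutes INT = 5;

IF OBJECT_ID('tempdb..#Intervals') IS NOT NULL
    DROP TABLE #Intervals;

WITH MaxDateMeasuresByProject AS
(
    SELECT PROJECT_ID, MaxCR_DATE = MAX(CR_DATE)
    FROM #Measures
    GROUP BY PROJECT_ID
),
RecursiveIntervals AS
(
    SELECT
        PROJECT_ID,
        IntervalStart = MIN(CR_DATE),
        IntervalEnd = DATEADD(MINUTE, @IntervalMinutes, MIN(CR_DATE)),
        RecursiveLevel = 1
    FROM #Measures
    GROUP BY PROJECT_ID

    UNION ALL

    SELECT
        R.PROJECT_ID,
        R.IntervalEnd,
        DATEADD(MINUTE, @IntervalMinutes, R.IntervalEnd),
        R.RecursiveLevel + 1
    FROM RecursiveIntervals AS R
    INNER JOIN MaxDateMeasuresByProject AS M ON R.PROJECT_ID = M.PROJECT_ID
    WHERE R.IntervalEnd <= M.MaxCR_DATE
)
SELECT R.PROJECT_ID, R.IntervalStart, R.IntervalEnd, R.RecursiveLevel
INTO #Intervals
FROM RecursiveIntervals AS R
OPTION (MAXRECURSION 0);

-- Optional index to support the range join below
CREATE CLUSTERED INDEX IX_Intervals ON #Intervals (PROJECT_ID, IntervalStart);

-- Same final step as before, reading the intervals from the temp table
SELECT M.*
FROM #Measures AS M
INNER JOIN
(
    SELECT FirstMeasureSR_NO = MIN(M2.sr_no)
    FROM #Measures AS M2
    INNER JOIN #Intervals AS I ON
        M2.PROJECT_ID = I.PROJECT_ID AND
        M2.CR_DATE >= I.IntervalStart AND
        M2.CR_DATE < I.IntervalEnd
    GROUP BY M2.PROJECT_ID, I.RecursiveLevel
) AS F ON F.FirstMeasureSR_NO = M.sr_no
ORDER BY M.PROJECT_ID, M.CR_DATE;
```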

You can also change the @IntervalMinutes value to see results for other interval lengths.

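    【Solution 2】:

    I think this requires recursion, because each 5-minute boundary is defined relative to the previous boundary rather than the first one: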

    WITH rcte AS (
        SELECT curr.*
        FROM @t AS curr
        WHERE NOT EXISTS (
            SELECT 1
            FROM @t
            WHERE PROJECT_ID = curr.PROJECT_ID AND CR_DATE < curr.CR_DATE
        )
        UNION ALL
        SELECT curr.*
        FROM rcte AS prev
        JOIN @t AS curr ON prev.PROJECT_ID = curr.PROJECT_ID AND curr.CR_DATE >= DATEADD(MINUTE, 5, prev.CR_DATE)
        WHERE NOT EXISTS (
            SELECT 1
            FROM @t
            WHERE PROJECT_ID = curr.PROJECT_ID AND CR_DATE < curr.CR_DATE AND CR_DATE >= DATEADD(MINUTE, 5, prev.CR_DATE)
        )
    )
    SELECT *
    FROM rcte
    

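    The rCTE is fairly simple:

    • The anchor part finds the first row of each project (the row for which no earlier row exists).
    • The recursive part finds rows whose date is at least 5 minutes after the previous selected row's date. The trick is to eliminate all but the first such row (using logic similar to the above).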
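
To try the query, @t has to exist first. The setup below is a guess at what it could look like: a table variable holding the question's rows for project 3QA12352, with only the columns the query touches. The column types are assumptions.

```sql
-- Hypothetical setup: @t is assumed to be a table variable with the question's data.
-- Only project 3QA12352 and the columns used by the query are included here.
DECLARE @t TABLE (
    sr_no INT PRIMARY KEY,
    PROJECT_ID VARCHAR(20),
    CR_DATE DATETIME
);

INSERT INTO @t (sr_no, PROJECT_ID, CR_DATE)
VALUES
    (1,  '3QA12352', '2018-12-18T09:38:26.477'),
    (2,  '3QA12352', '2018-12-18T09:39:26.430'),
    (3,  '3QA12352', '2018-12-18T09:40:26.447'),
    (4,  '3QA12352', '2018-12-18T09:41:26.437'),
    (5,  '3QA12352', '2018-12-18T09:42:33.280'),
    (6,  '3QA12352', '2018-12-18T09:43:33.267'),
    (7,  '3QA12352', '2018-12-18T09:44:33.157'),
    (8,  '3QA12352', '2018-12-18T09:45:33.320'),
    (9,  '3QA12352', '2018-12-18T09:46:33.293'),
    (10, '3QA12352', '2018-12-18T09:47:33.290'),
    (11, '3QA12352', '2018-12-18T09:48:33.330'),
    (12, '3QA12352', '2018-12-18T09:49:33.350');

-- Quick check that the rows were loaded
SELECT sr_no, PROJECT_ID, CR_DATE
FROM @t
ORDER BY CR_DATE;
```

A table variable is scoped to its batch, so the setup and the rCTE query have to run together. Tracing the logic by hand on these rows, the query should pick sr_no 1, 6 and 11: each pick is the first row at least 5 minutes after the previous pick.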

    Demo on db<>fiddle
