【Question title】: Summary of runtimes from large table
【Posted】: 2014-12-11 04:44:03
【Question】:

I have a table with 500 million rows and 25,000 unique tag_id values. It looks like this:

tag_id  event_time               event_value  reason_type
10087   2011-01-01 04:31:28.000  0            NULL
10087   2011-01-01 18:03:28.000  0            NULL
10087   2011-01-02 07:35:27.000  1            NULL
10087   2011-01-02 21:07:27.000  0            NULL
10087   2011-01-03 10:39:27.000  1            NULL
10087   2011-01-04 00:11:27.000  1            NULL

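For a given tag_id, 0 means the motor is off and 1 means the motor is on.

The system polls the motor's status at random intervals, or whenever the status changes.

I would like a summary showing how long each motor ran. For example: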

tag_id  date        runtime_mins
10087   2011-01-04  3600
10087   2011-01-05  2456
10087   2011-01-06  2321

Thanks for your ideas and help!

【Comments】:

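  • How long does SELECT ... GROUP BY ... take (with an index on tag_id and date)? How often do you want to generate this result? If you need fast results frequently, you could create a trigger that populates another table with the required aggregates as soon as new records are added. However, this may cause a slight delay when new data is inserted.
  • Are new entries added when event_value changes (0 to 1 or 1 to 0), or at time intervals?
  • The problem isn't the SELECT, it's computing the runtime. New entries can arrive at random intervals (with the status unchanged) or when the status changes.
  • 1 means the motor is turned on and 0 means it is turned off, and you need the difference between the on and off times in minutes. Is that right?
  • 1 is on, 0 is off. Total "on" time per day.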
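
To make the definition settled in these comments concrete, here is a minimal sketch rather than a tuned solution for 500 million rows. It assumes a reading's status holds until the next reading for the same tag, so every reading with event_value = 1 contributes the time up to the following reading, and it splits intervals that cross midnight so that each day gets its own share. The table variable @events and the CTE names (ordered, on_intervals, pieces) are hypothetical. LAG/LEAD are avoided on purpose because they are not available in SQL Server 2008 R2.

```sql
-- Hypothetical sample data; @events stands in for the real table.
DECLARE @events TABLE (tag_id INT, event_time DATETIME, event_value INT);

INSERT INTO @events VALUES
    (10087, '2011-01-01 04:31:28', 1),
    (10087, '2011-01-01 10:31:28', 0),
    (10087, '2011-01-01 18:03:28', 1),
    (10087, '2011-01-02 07:35:27', 1),
    (10087, '2011-01-02 21:07:27', 0);

WITH ordered AS (
    -- Number the readings per tag in time order.
    SELECT tag_id, event_time, event_value,
           ROW_NUMBER() OVER (PARTITION BY tag_id ORDER BY event_time) AS rn
    FROM @events
),
on_intervals AS (
    -- Assumption: a status holds until the next reading of the same tag,
    -- so each 'on' reading runs until the reading that follows it.
    SELECT cur.tag_id, cur.event_time AS start_time, nxt.event_time AS end_time
    FROM ordered AS cur
    JOIN ordered AS nxt
      ON nxt.tag_id = cur.tag_id AND nxt.rn = cur.rn + 1
    WHERE cur.event_value = 1
),
pieces AS (
    -- Anchor: cut each interval at the first midnight after its start.
    SELECT i.tag_id,
           i.start_time AS piece_start,
           CASE WHEN i.end_time < DATEADD(DAY, DATEDIFF(DAY, 0, i.start_time) + 1, 0)
                THEN i.end_time
                ELSE DATEADD(DAY, DATEDIFF(DAY, 0, i.start_time) + 1, 0)
           END AS piece_end,
           i.end_time
    FROM on_intervals AS i
    UNION ALL
    -- Recursive step: continue from the cut point until end_time is reached.
    SELECT p.tag_id,
           p.piece_end,
           CASE WHEN p.end_time < DATEADD(DAY, DATEDIFF(DAY, 0, p.piece_end) + 1, 0)
                THEN p.end_time
                ELSE DATEADD(DAY, DATEDIFF(DAY, 0, p.piece_end) + 1, 0)
           END,
           p.end_time
    FROM pieces AS p
    WHERE p.piece_end < p.end_time
)
SELECT tag_id,
       CONVERT(DATE, piece_start) AS [date],
       SUM(DATEDIFF(SECOND, piece_start, piece_end)) / 60 AS runtime_mins
FROM pieces
GROUP BY tag_id, CONVERT(DATE, piece_start)
ORDER BY tag_id, [date];
```

The last reading of each tag has no following reading, so it contributes nothing in this sketch. The recursive CTE stops after 100 levels by default, so an "on" interval spanning more than about 100 days would need OPTION (MAXRECURSION 0) on the final SELECT. On a table of this size, a calendar table joined to the intervals would probably scale better than recursion.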

Tags: sql tsql sql-server-2008-r2


【Solution 1】:
DECLARE @tmp AS TABLE
        (
          tag_id INT ,
          event_time DATETIME ,
          event_value INT
        )
    INSERT  INTO @tmp
    VALUES  ( 10087, '2011-01-01 04:31:28.000', 1 ),
            ( 10087, '2011-01-01 10:31:28.000', 0 ),
            ( 10087, '2011-01-01 18:03:28.000', 1 ),
            ( 10087, '2011-01-02 07:35:27.000', 1 ),
            ( 10087, '2011-01-02 21:07:27.000', 0 ),
            ( 10087, '2011-01-03 10:39:27.000', 1 ),
            ( 10087, '2011-01-04 00:11:27.000', 1 )
    ------------------------------------------------------------------
    --create temp table for data 'turned on'
    IF OBJECT_ID('Tempdb..#Turnedon') IS NOT NULL 
        DROP TABLE #Turnedon
    SELECT  T.tag_id ,
            CONVERT(DATE, T.event_time) AS Edate ,
            CONVERT(TIME, T.event_time) AS Etime ,
            ROW_NUMBER() OVER ( PARTITION BY T.tag_id, CONVERT(DATE, T.event_time) ORDER BY T.event_time ) RN
    INTO    #Turnedon
    FROM    @tmp AS T
    WHERE   T.event_value = 1
    ------------------------------------------------------------------
    --create temp table for data 'turned off'
    IF OBJECT_ID('Tempdb..#Turnedoff') IS NOT NULL 
        DROP TABLE #Turnedoff
    SELECT  T.tag_id ,
            CONVERT(DATE, T.event_time) AS Edate ,
            CONVERT(TIME, T.event_time) AS Etime ,
            ROW_NUMBER() OVER ( PARTITION BY T.tag_id, CONVERT(DATE, T.event_time) ORDER BY T.event_time ) RN
    INTO    #Turnedoff
    FROM    @tmp AS T
    WHERE   T.event_value = 0
    ------------------------------------------------------------------
    --Create temp table for catalog with unique dates and tag_id
    IF OBJECT_ID('Tempdb..#Catalog') IS NOT NULL 
        DROP TABLE #Catalog
    SELECT DISTINCT
            T.tag_id ,
            CONVERT(DATE, T.event_time) AS Edate
    INTO    #Catalog
    FROM    @tmp AS T
    -------------------------------------------------------------------
/* 
The row number pairs the n-th 'on' of a day with the n-th 'off' of the same day,
so a day with more than one on/off cycle produces one row per cycle.
To get a daily total, aggregate (SUM) Runtime over these rows.
Given the data volume, it is better to insert these rows into a temp table first and aggregate from there.
*/

    SELECT  C.tag_id ,
            C.Edate ,
            COALESCE(T.RN, 1) AS [event id] ,
            COALESCE(T.Etime, CONVERT(TIME, '00:00:00')) AS [turned on] ,
            COALESCE(T2.Etime, CONVERT(TIME, '23:59:59')) AS [turned off] ,
            DATEDIFF(MINUTE, COALESCE(T.Etime, CONVERT(TIME, '00:00:00')),
                     COALESCE(T2.Etime, CONVERT(TIME, '23:59:59'))) Runtime
    FROM    #Catalog AS C
            LEFT JOIN #Turnedon AS T ON C.tag_id = T.tag_id
                                        AND C.Edate = T.Edate
            LEFT JOIN #Turnedoff AS T2 ON C.tag_id = T2.tag_id
                                          AND C.Edate = T2.Edate
                                          AND COALESCE(T.RN, 1) = T2.RN

【Comments】:

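  • Oh, this is so close! But if there are multiple on/off events per day, it only counts the first on/off period.
  • @user1745767 That is because the provided data has two "off" readings on 2011-01-01 and no "on" data for that date. In other words, if the motor was turned off once on a date and there is no "on" between that and the second "off", the second "off" does not apply.
  • @user1745767 I posted a snapshot with the normal case. When the same date has two "off" readings and only one "on", the second "off" is ignored.
  • The data set is just a sample. I have 500 million rows, with multiple on/off events per day. I appreciate your ideas.
  • @user1745767 It also works with multiple on/off events, as long as each "off" is followed by an "on"; otherwise the extra "off" is ignored. For example, "on-off-on-off-on" works, while "on-off-on-off-off" works up to the third "off", because there is no "on" data after the second "off". The last "event id" number will then show the last "off" for that date rather than an "on".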
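
The "on-off-on-off-off" case from the last comment can be illustrated with a small sketch. One possible way to stop repeated readings from breaking the row-number pairing is to keep only the readings where the status actually changes, so that "on" and "off" strictly alternate per tag. This is an assumption about how to pre-clean the data, not part of the answer above; @events and the CTE name ordered are hypothetical, and a self-join is used instead of LAG so it fits SQL Server 2008 R2.

```sql
-- Hypothetical sample: on, off, on, off, off (the last 'off' is a repeated poll).
DECLARE @events TABLE (tag_id INT, event_time DATETIME, event_value INT);

INSERT INTO @events VALUES
    (10087, '2011-01-01 04:31:28', 1),
    (10087, '2011-01-01 10:31:28', 0),
    (10087, '2011-01-01 12:00:00', 1),
    (10087, '2011-01-01 15:00:00', 0),
    (10087, '2011-01-01 18:03:28', 0);

WITH ordered AS (
    -- Number the readings per tag in time order.
    SELECT tag_id, event_time, event_value,
           ROW_NUMBER() OVER (PARTITION BY tag_id ORDER BY event_time) AS rn
    FROM @events
)
SELECT cur.tag_id, cur.event_time, cur.event_value
FROM ordered AS cur
LEFT JOIN ordered AS prev
       ON prev.tag_id = cur.tag_id AND prev.rn = cur.rn - 1
-- Keep the first reading of each tag and every reading whose status
-- differs from the reading just before it.
WHERE prev.rn IS NULL
   OR prev.event_value <> cur.event_value
ORDER BY cur.tag_id, cur.event_time;
```

With this sample, the repeated "off" at 18:03:28 is dropped and the other four readings are kept. Feeding the filtered rows into the on/off temp tables would give each "on" exactly one matching "off", apart from cycles that are still open at the start or end of a day.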
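【Solution 2】:
-- Handles sequences such as "off-off-on-off-off-on"; this is just an example of data normalization.
-- Note: LAG requires SQL Server 2012 or later, so this script will not run on SQL Server 2008 R2 (the question's tag).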
 DECLARE    @tmp AS TABLE
        (
          tag_id INT ,
          event_time DATETIME ,
          event_value INT
        )
    INSERT  INTO @tmp
    VALUES  ( 10087, '2011-01-01 04:31:28.000', 1 ),
            ( 10087, '2011-01-01 10:31:28.000', 0 ),
            ( 10087, '2011-01-01 18:03:28.000', 1 ),
            ( 10087, '2011-01-02 02:35:27.000', 1 ),
            ( 10087, '2011-01-02 07:35:27.000', 1 ),
            ( 10087, '2011-01-02 11:07:27.000', 0 ),
            ( 10087, '2011-01-02 21:07:27.000', 0 ),
            ( 10087, '2011-01-03 10:39:27.000', 1 ),
            ( 10087, '2011-01-04 00:11:27.000', 1 )
        ------------------------------------------------------------------
        --create temp table for data 'turned on'
    IF OBJECT_ID('Tempdb..#Turnedon') IS NOT NULL 
        DROP TABLE #Turnedon
    SELECT  T.tag_id ,
            T.Edate ,
            ( CASE WHEN t.event_value = 0
                        AND t.prev_event_value = 0 THEN t.prev_Etime
                   ELSE t.Etime
              END ) AS Etime ,
            ( CASE WHEN t.event_value = 0
                        AND t.prev_event_value = 0
                   THEN 'previous ''off'' was not executed'
                   ELSE NULL
              END ) AS event_desc ,
            ROW_NUMBER() OVER ( PARTITION BY T.tag_id, Edate ORDER BY Etime ) RN
    INTO    #Turnedon
    FROM    ( SELECT    T.tag_id ,
                        CONVERT(DATE, T.event_time) AS Edate ,
                        CONVERT(TIME, T.event_time) AS Etime ,
                        T.event_value ,
                        lag(T.event_value) OVER ( PARTITION BY T.tag_id,
                                                  CONVERT(DATE, T.event_time) ORDER BY T.event_time ) AS prev_event_value ,
                        lag(CONVERT(TIME, T.event_time)) OVER ( PARTITION BY T.tag_id,
                                                                CONVERT(DATE, T.event_time) ORDER BY T.event_time ) AS prev_Etime
              FROM      @tmp AS T
            ) T
    WHERE   T.event_value = 1
            OR ( T.event_value = 0
                 AND T.prev_event_value = 0
               )
        ------------------------------------------------------------------
        --create temp table for data 'turned off'
    IF OBJECT_ID('Tempdb..#Turnedoff') IS NOT NULL 
        DROP TABLE #Turnedoff
    SELECT  T.tag_id ,
            T.Edate ,
            ( CASE WHEN t.event_value = 1
                        AND t.prev_event_value = 1 THEN t.prev_Etime
                   ELSE t.Etime
              END ) AS Etime ,
            ( CASE WHEN t.event_value = 1
                        AND t.prev_event_value = 1
                   THEN '''on'' was not executed'
                   ELSE NULL
              END ) AS event_desc ,
            ROW_NUMBER() OVER ( PARTITION BY T.tag_id, Edate ORDER BY Etime ) RN
    INTO    #Turnedoff
    FROM    ( SELECT    T.tag_id ,
                        CONVERT(DATE, T.event_time) AS Edate ,
                        CONVERT(TIME, T.event_time) AS Etime ,
                        T.event_value ,
                        lag(T.event_value) OVER ( PARTITION BY T.tag_id,
                                                  CONVERT(DATE, T.event_time) ORDER BY T.event_time ) AS prev_event_value ,
                        lag(CONVERT(TIME, T.event_time)) OVER ( PARTITION BY T.tag_id,
                                                                CONVERT(DATE, T.event_time) ORDER BY T.event_time ) AS prev_Etime
              FROM      @tmp AS T
            ) T
    WHERE   T.event_value = 0
            OR ( T.event_value = 1
                 AND T.prev_event_value = 1
               )                                     
        ------------------------------------------------------------------
        --Create temp table for catalog with unique dates and tag_id
    IF OBJECT_ID('Tempdb..#Catalog') IS NOT NULL 
        DROP TABLE #Catalog
    SELECT DISTINCT
            T.tag_id ,
            CONVERT(DATE, T.event_time) AS Edate
    INTO    #Catalog
    FROM    @tmp AS T
        -------------------------------------------------------------------
    /* 
    The row number pairs the n-th 'on' of a day with the n-th 'off' of the same day,
    so a day with more than one on/off cycle produces one row per cycle.
    To get a daily total, aggregate (SUM) Runtime over these rows.
    Given the data volume, it is better to insert these rows into a temp table first and aggregate from there.
    */       
    SELECT  C.tag_id ,
            C.Edate ,
            COALESCE(T.RN, 1) AS [event id] ,
            COALESCE(T.Etime, CONVERT(TIME, '00:00:00')) AS [turned on] ,
            COALESCE(T2.Etime, CONVERT(TIME, '23:59:59')) AS [turned off] ,
            DATEDIFF(MINUTE, COALESCE(T.Etime, CONVERT(TIME, '00:00:00')),
                     COALESCE(T2.Etime, CONVERT(TIME, '23:59:59'))) Runtime ,
            COALESCE(t2.event_desc, t.event_desc, '') AS event_desc
    FROM    #Catalog AS C
            LEFT JOIN #Turnedon AS T ON C.tag_id = T.tag_id
                                        AND C.Edate = T.Edate
            LEFT JOIN #Turnedoff AS T2 ON C.tag_id = T2.tag_id
                                          AND C.Edate = T2.Edate
                                          AND COALESCE(T.RN, 1) = T2.RN
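
The final SELECT returns one row per on/off pair, while the question asks for one row per tag and day. A minimal sketch of the remaining aggregation step follows. The table variable @runs is hypothetical and stands in for the rows of that SELECT after saving them (for example with SELECT ... INTO a temp table), and the Runtime values are illustrative only.

```sql
-- Hypothetical stand-in for the per-cycle rows produced by the final SELECT.
DECLARE @runs TABLE (tag_id INT, Edate DATE, Runtime INT);

INSERT INTO @runs VALUES
    (10087, '2011-01-01', 360),   -- first on/off cycle of the day
    (10087, '2011-01-01', 356),   -- second cycle of the same day
    (10087, '2011-01-02', 812);

-- Collapse the per-cycle rows into one total per tag and day.
SELECT tag_id,
       Edate        AS [date],
       SUM(Runtime) AS runtime_mins
FROM @runs
GROUP BY tag_id, Edate
ORDER BY tag_id, Edate;
-- 2011-01-01 sums to 716 minutes, 2011-01-02 to 812.
```

Because each cycle is clipped to 00:00:00 and 23:59:59 within its own day, a run that continues past midnight is only counted correctly if the following day also has a row that starts at 00:00:00.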

【Comments】:
