【问题标题】:Sample time serie by time interval with Hive QL and calculate jumps使用 Hive QL 按时间间隔对时间序列进行采样并计算跳跃
【发布时间】:2015-03-05 13:19:09
【问题描述】:

我的表格中有时间序列数据。基本上每一行都有一个时间戳和一个值。 数据的频率是绝对随机的。

我想以给定的频率对其进行采样,并为每个频率提取有关它的相关信息:最小、最大、最后、更改(相对先前)、返回(更改/先前)甚至更多(计数... )

这是我的意见:

08:00:10, 1
08:01:20, 2
08:01:21, 3
08:01:24, 5
08:02:24, 2

我想得到以下 1 分钟采样的结果(ts、min、max、last、change、return):

ts        m  M  L  Chg   Return   
08:01:00, 1, 1, 1, NULL, NULL
08:02:00, 2, 5, 5, 4,    4
08:03:00, 2, 2, 2, -3,   -0.25

【问题讨论】:

    标签: hadoop hive time-series hiveql


    【解决方案1】:

    你可以用这样的东西来做(cmets inline):

    SELECT
        min
      , mn
      , mx
      , l
      , l - LAG(l, 1) OVER (ORDER BY min) c
        -- This might not be the right calculation. Unsure how -0.25 was derived in question.
      , (l - LAG(l, 1) OVER (ORDER BY min)) / (LAG(l, 1) OVER (ORDER BY min)) r
    FROM
    (
      SELECT
          min
        , MIN(val) mn
        , MAX(val) mx
        -- We can take MAX here because all l's (last values) for the minute are the same.
        , MAX(l) l
      FROM
      (
        SELECT
            min
          , val
          -- The last value of the minute, ordered by the timestamp, using all rows.
          , LAST_VALUE(val) OVER (PARTITION BY min ORDER BY ts ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) l
        FROM
        (
          SELECT
              ts
            -- Drop the seconds and go back one minute by converting to seconds,
            -- subtracting 60, and then going back to a shorter string format.
            -- 2000-01-01 is a dummy date just to enable the conversion.
            , CONCAT(FROM_UNIXTIME(UNIX_TIMESTAMP(CONCAT("2000-01-01 ", ts), "yyyy-MM-dd HH:mm:ss") + 60, "HH:mm"), ":00") min       
            , val
          FROM
            -- As from the question.
            21908430_input a
        ) val_by_min
      ) val_by_min_with_l
      GROUP BY min
    ) min_with_l_m_M
    ORDER BY min
    ;
    

    结果:

    +----------+----+----+---+------+------+
    | min      | mn | mx | l | c    | r    |
    +----------+----+----+---+------+------+
    | 08:01:00 | 1  | 1  | 1 | NULL | NULL |
    | 08:02:00 | 2  | 5  | 5 | 4    | 4    |
    | 08:03:00 | 2  | 2  | 2 | -3   | -0.6 |
    +----------+----+----+---+------+------+
    

    【讨论】:

      猜你喜欢
      • 1970-01-01
      • 1970-01-01
      • 2011-07-02
      • 1970-01-01
      • 1970-01-01
      • 2013-11-17
      • 2022-09-29
      • 1970-01-01
      • 1970-01-01
      相关资源
      最近更新 更多