【Question title】: SQL - I want to get the timestamps in query in 30 minute intervals
【Posted】: 2019-09-12 23:35:07
【Question description】:

I have the following query:

    SELECT DISTINCT table_1.id,
           table_1.time_utc,
           table_1.city_uuid,
           cast(table_2.score_rate as decimal(5,3)) as score_rate
    FROM integrated_delivery.trip_table_1_fact table_1,
         integrated_product.driver_score_v2 table_2
    WHERE table_1.id = table_2.id
      AND table_1.city_uuid = table_2.city_id
      AND table_1.day = date '2019-04-01'
      AND table_2.extract_dt = 20190331
      AND EXISTS
        (SELECT NULL
         FROM table_3
         WHERE table_1.id = table_3.id
           AND table_1.time_utc >= table_3.start_time_utc
           AND table_1.time_utc <= table_3.end_time_utc)

I want to change this query so that it returns table_1.offer_time_utc in 30-minute intervals.

Table_1 looks like this (sample rows):

       id               time_utc    
b7-19b36a410ab0  2019-04-16 22:00:09.415
53-9127667e288e  2019-04-17 01:06:16.590
6b-a96c3ea196c4  2019-04-16 22:00:09.908

Table_3 looks like this:

           id          start_time_utc       end_time_utc    
35-e512d080e5d3 2019-01-29 02:00:00.000 2019-01-29 03:30:00.000
94-07e7036c1e4b 2019-01-29 01:30:00.000 2019-01-29 02:30:00.000
7d-20736d277064 2019-01-29 01:00:00.000 2019-01-29 03:30:00.000

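How can I adjust the query above so that it pulls all records (rows) in 30-minute intervals and also has a column that shows the interval?

Something like: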

  interval       
-------------------    
2010-11-16 10:30:00  
2010-11-16 10:35:00
2010-11-16 10:40:00   
2010-11-16 10:45:00
2010-11-16 10:50:00   
2010-11-16 10:55:00 

The expected output is basically what I already have in the table_1 sample, but with the interval as follows:

Id               Interval     ( time_utc)
b7-19b36a410ab0  2010-11-16 10:30:00  
53-9127667e288e  2010-11-16 11:00:00
6b-a96c3ea196c4  2010-11-16 11:30:00  

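Thanks!

【Question discussion】:

  • Please show us your complete expected result. Thanks.
  • Sure! Just added the desired sample output! @âńōŋŷXmoůŜ
  • Which database are you querying? You tagged both Presto and Postgres. For Presto, see stackoverflow.com/a/47741138/65458

Tags: sql timestamp data-analysis presto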


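【Solution 1】:

I would use a common table expression (CTE) to generate one datetime interval every 30 minutes, then join your query against it. You can see my sample data in this fiddle: http://www.sqlfiddle.com/#!4/bf5a7/18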

    -- Oracle syntax: generate 2000 consecutive 30-minute buckets
    -- starting at 2019-04-16 00:00:00
    WITH interval_dates AS
    (SELECT timestamp '2019-04-16 00:00:00'
                + NUMTODSINTERVAL(30 * rownum - 30, 'MINUTE') AS from_interval,
            timestamp '2019-04-16 00:00:00'
                + NUMTODSINTERVAL(30 * rownum, 'MINUTE') AS to_interval
     FROM dual CONNECT BY level <= 2000)
    SELECT t1.*, dt.from_interval
    FROM interval_dates dt,
         (SELECT DISTINCT table_1.id,
                 table_1.time_utc,
                 table_1.city_uuid,
                 cast(table_2.score_rate as decimal(5,3)) as score_rate
          FROM integrated_delivery.trip_table_1_fact table_1,
               integrated_product.driver_score_v2 table_2
          WHERE table_1.id = table_2.id
            AND table_1.city_uuid = table_2.city_id
            AND table_1.day = date '2019-04-01'
            AND table_2.extract_dt = 20190331
            AND EXISTS
              (SELECT NULL
               FROM table_3
               WHERE table_1.id = table_3.id
                 AND table_1.time_utc >= table_3.start_time_utc
                 AND table_1.time_utc <= table_3.end_time_utc)) t1
    -- each row matches exactly one half-open bucket [from_interval, to_interval)
    WHERE t1.time_utc >= dt.from_interval AND t1.time_utc < dt.to_interval

Sample result:

ID              TIME_UTC                FROM_INTERVAL
b7-19b36a410ab0 2019-04-16 22:00:09.0   2019-04-16 22:00:00.0
6b-a96c3ea196c4 2019-04-16 22:00:09.0   2019-04-16 22:00:00.0
53-9127667e288e 2019-04-17 01:06:16.0   2019-04-17 01:00:00.0
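
To see what the interval generator produces on its own, it can be run in isolation with a small row count. This is only a sketch in the same Oracle dialect as the answer; the limit of 4 rows is an arbitrary choice for illustration, and `level` is used in place of `rownum` (the two yield the same sequence in this single-root hierarchy).

```sql
-- Oracle: preview the first four 30-minute buckets.
-- The row limit (4) is an assumption chosen to keep the output short.
SELECT timestamp '2019-04-16 00:00:00'
           + NUMTODSINTERVAL(30 * (level - 1), 'MINUTE') AS from_interval,
       timestamp '2019-04-16 00:00:00'
           + NUMTODSINTERVAL(30 * level, 'MINUTE')       AS to_interval
FROM dual
CONNECT BY level <= 4;
```

The first bucket starts at 00:00 and each following bucket starts where the previous one ends, so the half-open comparison in the join assigns every timestamp to exactly one bucket.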

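【Discussion】:

  • Thanks. One question: I want distinct IDs within each interval. That is, the same ID may appear under different FROM_INTERVAL values (times), but not more than once in the same FROM_INTERVAL bucket. In other words, ID B can exist in different intervals but must not be repeated within the same one. @âńōŋŷXmoůŜ

【Solution 2】: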

TL;DR

The following construct produces the 30-minute floor for any timestamp:

date_trunc('hour', table_1.time_utc) + (
    CASE
        WHEN (extract(minute from table_1.time_utc) >= 30) THEN
            '30 minutes'::interval
        ELSE
            '0'::interval
    END
)
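
As a quick check, the construct can be tried on literal values without touching the real tables. The sketch below uses the same PostgreSQL syntax; the first two timestamps come from the table_1 sample in the question, while the third (22:45:30) is a hypothetical value added only to exercise the `>= 30` branch.

```sql
-- PostgreSQL: apply the 30-minute floor to literal timestamps.
-- The third row is a made-up value, not taken from the question.
SELECT ts,
       date_trunc('hour', ts) + (
           CASE
               WHEN extract(minute from ts) >= 30 THEN '30 minutes'::interval
               ELSE '0'::interval
           END
       ) AS floor_30
FROM (VALUES (timestamp '2019-04-16 22:00:09.415'),
             (timestamp '2019-04-17 01:06:16.590'),
             (timestamp '2019-04-16 22:45:30.000')) AS t(ts);
```

The first two rows floor to 22:00:00 and 01:00:00, and the hypothetical third row floors to 22:30:00.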

Long version

Applied to your case:

SELECT DISTINCT table_1.id,
       table_1.time_utc,
       date_trunc('hour', table_1.time_utc) + CASE
           WHEN (extract(minute from table_1.time_utc) >= 30) THEN '30 minutes'::interval
           ELSE '0'::interval
       END AS time_utc_aligned,
       table_1.city_uuid,
       cast(table_2.score_rate as decimal(5,3)) as score_rate
FROM integrated_delivery.trip_table_1_fact table_1,
     integrated_product.driver_score_v2 table_2
WHERE table_1.id = table_2.id
      AND table_1.city_uuid = table_2.city_id
      AND table_1.day = date '2019-04-01'
      AND table_2.extract_dt = 20190331
      AND EXISTS (
        SELECT NULL
        FROM table_3
        WHERE table_1.id = table_3.id
              AND table_1.time_utc >= table_3.start_time_utc
              AND table_1.time_utc <= table_3.end_time_utc
      )
;

...which produces (with test data):

       id        |        time_utc         |  time_utc_aligned   |              city_uuid               | score_rate 
-----------------+-------------------------+---------------------+--------------------------------------+------------
 53-9127667e288e | 2019-04-17 01:06:16.59  | 2019-04-17 01:00:00 | 909153dc-c1ff-4e65-a32e-c9194ddfbec9 |      4.662
 6b-a96c3ea196c4 | 2019-04-16 22:00:09.908 | 2019-04-16 22:00:00 | b2d402a2-ba2d-483b-a4c0-fae95ee1700c |      2.250
 b7-19b36a410ab0 | 2019-04-16 22:00:09.415 | 2019-04-16 22:00:00 | 889f9aed-f399-4059-b97b-d67b0af0096d |      1.744

If you have the TimescaleDB extension, its time_bucket function makes this more readable:

SELECT DISTINCT table_1.id,
       table_1.time_utc,
       time_bucket('30 minutes', table_1.time_utc) AS time_utc_aligned,
       table_1.city_uuid,
       cast(table_2.score_rate as decimal(5,3)) as score_rate
FROM integrated_delivery.trip_table_1_fact table_1,
     integrated_product.driver_score_v2 table_2
WHERE table_1.id = table_2.id
      AND table_1.city_uuid = table_2.city_id
      AND table_1.day = date '2019-04-01'
      AND table_2.extract_dt = 20190331
      AND EXISTS (
        SELECT NULL
        FROM table_3
        WHERE table_1.id = table_3.id
              AND table_1.time_utc >= table_3.start_time_utc
              AND table_1.time_utc <= table_3.end_time_utc
      )
;
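
The `::interval` cast and time_bucket are specific to PostgreSQL and TimescaleDB. Since the question is tagged presto, here is a rough equivalent of the same floor in Presto, sketched with the built-in `date_trunc`, `minute`, and `date_add` functions. It runs against the two sample timestamps from the question rather than the real tables; the alias `time_utc_aligned` mirrors the one used above.

```sql
-- Presto sketch: floor a timestamp to its 30-minute bucket.
-- minute(ts) returns BIGINT, so minute(ts) / 30 is integer division (0 or 1).
SELECT ts,
       date_add('minute',
                (minute(ts) / 30) * 30,
                date_trunc('hour', ts)) AS time_utc_aligned
FROM (
    VALUES timestamp '2019-04-16 22:00:09.415',
           timestamp '2019-04-17 01:06:16.590'
) AS t(ts);
```

Both sample timestamps fall in the first half of their hour, so they align to 22:00 and 01:00 respectively. In the full query, the `date_add(...)` expression would take the place of the `CASE` expression.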

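【Discussion】:

  • Thanks. One question: I want distinct IDs within each interval. That is, the same ID may appear under different FROM_INTERVAL values (times), but not more than once in the same FROM_INTERVAL bucket. In other words, ID B can exist in different intervals but must not be repeated within the same one. @Ancoron
  • Sorry, I don't understand. Could you edit your question and provide data with an example?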
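
The request in the comments was never clarified, so the following is only one possible reading: keep at most one row per ID within each 30-minute bucket, while still allowing the same ID in different buckets. A minimal PostgreSQL sketch under that assumption, using hypothetical sample rows (the IDs 'A' and 'B' and their timestamps are made up) and keeping the earliest timestamp per group:

```sql
-- Hypothetical data: ID 'B' occurs twice in the 22:00 bucket and once in 22:30.
WITH sample(id, time_utc) AS (
    VALUES ('A', timestamp '2019-04-16 22:00:09'),
           ('B', timestamp '2019-04-16 22:05:00'),
           ('B', timestamp '2019-04-16 22:20:00'),
           ('B', timestamp '2019-04-16 22:40:00')
),
bucketed AS (
    SELECT id,
           time_utc,
           date_trunc('hour', time_utc)
               + CASE WHEN extract(minute from time_utc) >= 30
                      THEN '30 minutes'::interval
                      ELSE '0'::interval
                 END AS from_interval
    FROM sample
)
-- One row per (id, bucket); min() picks the earliest timestamp in the group.
SELECT id, from_interval, min(time_utc) AS first_time_utc
FROM bucketed
GROUP BY id, from_interval
ORDER BY from_interval, id;
```

With this data the result has three rows: 'A' and 'B' once each in the 22:00 bucket, and 'B' once more in the 22:30 bucket. Whether to keep the earliest row, the latest, or something else per group is a choice the comment does not settle.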