【问题标题】:Get Last Value Based on Timestamp and Value Group in Snowflake根据 Snowflake 中的时间戳和值组获取最后一个值
【发布时间】:2021-08-26 10:03:42
【问题描述】:

我正在尝试使用 Snowflake 根据时间戳获取组的最后一个值。

我有下表,

| ISSUE_ID | ISSUE_ID | FIELD_TIME | FIELD_NAME | FIELD_VALUE| STATUS
| -------- | ---------- | ---------- | ----------| ----------| ----------
|19229| X1|2021-08-01 09:04:35.234000000 +00:00|Status|10010|PENDING
|19229| X1|2021-08-01 09:04:35.234000000 +00:00|Number of Products|55|PENDING
|19229| X1|2021-08-01 09:04:35.234000000 +00:00|Number of SKU not created|34|PENDING
|19229| X1|2021-08-01 09:04:35.234000000 +00:00|Number of SKU live|21|PENDING
|19229| X1|2021-08-01 09:04:35.234000000 +00:00|Date of SKU live|2021-08-12|PENDING
|19229| X1|2021-08-08 06:19:05.209000000 +00:00|Status|10010|PENDING
|19229| X1|2021-08-08 11:43:51.953000000 +00:00|Status|10010|PENDING
|19229| X1|2021-08-12 06:12:15.185000000 +00:00|Number of Products|55|PENDING
|19229| X1|2021-08-12 06:27:32.199000000 +00:00|Number of SKU not created|31|PENDING
|19229| X1|2021-08-12 09:14:16.178000000 +00:00|Number of SKU live|24|PENDING

我希望表格中的每个值都有一列(例如,产品数量、SKU 上线日期等),并且该列仅包含当天的最后一个值。如下表,

| ISSUE_ID | ISSUE_ID | FIELD_TIME | Number of Products | Number of SKU live| Date of SKU live|Number of SKU not created|Work_In_Progress_Date|Pending_Date|Status|Resolution|
|--------|----------|---------- |----------|----------|----------|----------|----------
|19229   | X1       |2021-08-01 |55        |21        |2021-08-01|34        |          |             |PENDING              | null
|19229   | X1       |2021-08-08 |          |          |          |          |          |             |PENDING              | null
|19229   | X1       |2021-08-12 |55        |24        |2021-08-01|31        |2021-08-12|   2021-08-12|PENDING              | null

我尝试过last_value(FIELD_VALUE) over (partition by FIELD_NAME, ISSUE_ID order by field_time),但它给了我 23 行而不是 3 行的重复值。 我也试过lag(),但没有运气。
如果有人知道如何进行这种转换,请帮助我。

这是我的 SQL 代码,

select
       t.ISSUE_ID, t.issue_name,
--        t.created_date,
       t.field_time::date as field_time,
       max(case when field_name = 'Number of Products' then field_value end) as Number_of_Products,
       max(case when field_name = 'Number of SKU live' then field_value end) as Number_of_SKU_Live,
       max(case when field_name = 'Number of SKU not created' then field_value end) as Number_of_SKU_Not_Created,
       max(case when field_name = 'Date of SKU live' then field_value end) as Date_of_SKU_Live,
       max(case when field_value = '10020' then date(t.field_time) end) as Work_In_Progress_Date,
       max(case when field_value = '10010' then date(t.field_time) end) as Pending_Date,
       t.status, t.resolution
       from
(select fh.ISSUE_ID,
       i.issue_name,
       date(i.created_date) as created_date,
       fh.TIME as field_time,
       f.name as field_name,
       fh.value as field_value,
       i.status,
       i.resolution
from JIRA.ISSUE_FIELD_HISTORY fh
         left join JIRA.FIELD f on fh.FIELD_ID = f.ID and f._FIVETRAN_DELETED = 0
         left join (select i0.created as created_date,r.name as resolution, i0.id, i0.key as issue_name, i0.status as status_id, s.name as status
                    from JIRA.issue i0
                             left join JIRA.status s on i0.status = s.ID
                             left join JIRA.RESOLUTION r on i0.RESOLUTION = r.ID
             where i0._FIVETRAN_DELETED = 0
             and i0.key like 'PIM%')
             i on i.id = fh.ISSUE_ID
where fh.ISSUE_ID in (select ID from ISSUE where PROJECT = 10041)
and fh.FIELD_ID in ('customfield_10067', 'customfield_10063', 'customfield_10066', 'customfield_10068', 'status', 'resolution')
-- and issue_name = 'PIM-11'
qualify row_number() over (partition by issue_id, field_time::date, field_name order by field_time desc) = 1
order by field_time) t
group by issue_id, issue_name,created_date, field_time::date, status, resolution

【问题讨论】:

    标签: sql snowflake-cloud-data-platform


    【解决方案1】:

    您可以使用qualify 获取最新值:

    select t.*
    from t
    qualify row_number() over (partition by issue_id, field_time::date, field_name order by field_time desc) = 0;
    

    然后聚合:

    select issue_id, field_time::date,
           max(case when field_name = 'Number of Products' then field_value end) as number_of_products,
           max(case when field_name = 'Number of SKU live' then field_value end) as number_of_sku_live,
           . . . 
    from (select t.*
          from t
          qualify row_number() over (partition by issue_id, field_time::date, field_name order by field_time desc) = 0
         ) t
    group by issue_id, field_time::date
    

    【讨论】:

    • max(case when..) 解决了没有资格的问题。这为我节省了很多时间。谢谢。
    • 在某些情况下,它会从某个“issue_id”返回零行,我该如何克服这个问题? @戈登
    • @Nabaa 。 . .此查询将返回您表中的所有问题,所以我不明白您的评论。
    • 它会复制某些问题的值 - 有些使用零。我不确定这个结果是如何发生的 - 如果您想检查它,我编辑了实际问题以粘贴我的代码。谢谢
    猜你喜欢
    • 2023-02-10
    • 2021-09-18
    • 2022-12-06
    • 2018-05-04
    • 2021-06-12
    • 2021-06-01
    • 1970-01-01
    • 2022-12-17
    • 1970-01-01
    相关资源
    最近更新 更多