【发布时间】:2021-05-04 00:19:28
【问题描述】:
我在 aws redshift 中运行下面的 postgresql 查询。每次我运行此查询时,我都会使用 except 运算符在 daily_table.product_repeat_sub_query 端获得不同的记录数的不同结果。在此期间,daily_table.product_repeat_sub_query 表或 daily_table.daily_sku_t 都不会更新。 daily_table.product_repeat_sub_query 表和 product_repeat_sub_query 查询都具有相同的记录数。 daily_table.daily_sku_t 的架构如下,daily_table.product_repeat_sub_query 中的匹配字段具有相同的数据类型。我还包括了下表中的一些示例记录。有没有人知道,当底层表没有改变时,每次运行这个查询时,except 查询的结果会如何不同?
daily_table.daily_sku_t schema:
customer_uuid string
boardname_12 string
producttype string
productsubtype string
storeid int
product_id string
dateclosed date
Size string
查询:
with product_repeat_sub_query as
(
select
dateclosed, t.product_id, t.storeid, t.producttype, t.productsubtype, t.size, t.boardname_12,
case
when ticketid = first_value(ticketid) over (partition by t.product_id, customer_uuid
ORDER BY
dateclosed ASC rows between unbounded preceding and unbounded following) then 0
else grossreceipts
end as product_repeat_gross, datediff(day,
lag(dateclosed, 1) over (partition by t.boardname_12, customer_uuid, t.product_id
ORDER BY
dateclosed ASC ),
dateclosed) as product_cycle_days
from
daily_table.daily_sku_t t )
select count(*) from
(
select dateclosed, storeid, boardname_12, producttype, productsubtype, size, product_id, product_cycle_days from daily_table.product_repeat_sub_query
except
select dateclosed, storeid, boardname_12, producttype, productsubtype, size, product_id, product_cycle_days from product_repeat_sub_query
);
-- 36843 -- 36887 -- 36188
数据:
daily_table.product_repeat_sub_query
dateclosed storeid boardname_12 producttype productsubtype size product_id product_cycle_days
2021-04-23 427 22RED DRUMER 1T 000011aa-4f03-4f0b-a621-xxxxxxxxxxxx 2
2021-04-24 427 22RED DRUMER 1T 000011aa-4f03-4f0b-a621-xxxxxxxxxxxx 6
2021-04-26 427 22RED DRUMER 1T 000011aa-4f03-4f0b-a621-xxxxxxxxxxxx 8
2021-04-26 427 22RED DRUMER 1T 000011aa-4f03-4f0b-a621-xxxxxxxxxxxx 3
2021-05-01 427 22RED DRUMER 1T 000011aa-4f03-4f0b-a621-xxxxxxxxxxxx 13
2020-06-18 61 FLAV RX WINGER BEVERAGE 100MT 0000265d-6b81-4d79-90cf-xxxxxxxxxxxx 5
2020-06-29
product_repeat_subquery
dateclosed storeid boardname_12 producttype productsubtype size product_id product_cycle_days
2021-04-23 427 22RED DRUMER 1T 000011aa-4f03-4f0b-a621-xxxxxxxxxxxx 2
2021-04-24 427 22RED DRUMER 1T 000011aa-4f03-4f0b-a621-xxxxxxxxxxxx 6
2021-04-26 427 22RED DRUMER 1T 000011aa-4f03-4f0b-a621-xxxxxxxxxxxx 8
2021-04-26 427 22RED DRUMER 1T 000011aa-4f03-4f0b-a621-xxxxxxxxxxxx 3
2021-05-01 427 22RED DRUMER 1T 000011aa-4f03-4f0b-a621-xxxxxxxxxxxx 13
2020-06-18 61 FLAV RX WINGER BEVERAGE 100MT 0000265d-6b81-4d79-90cf-xxxxxxxxxxxx 5
2020-06-29
更新:
with product_repeat_sub_query as
(
select customer_uuid,
dateclosed, t.product_id, t.storeid, t.producttype, t.productsubtype, t.size, t.boardname_12,
case
when ticketid = first_value(ticketid) over (partition by t.product_id, customer_uuid
ORDER BY
dateclosed ASC rows between unbounded preceding and unbounded following) then 0
else grossreceipts
end as product_repeat_gross, datediff(day,
lag(dateclosed, 1) over (partition by t.boardname_12, customer_uuid, t.product_id
ORDER BY
dateclosed ASC,t.boardname_12, customer_uuid, t.product_id ),
dateclosed) as product_cycle_days
from
daily_table.daily_sku_t t
where (t.customer_uuid is not null)
and (trim(t.customer_uuid) != '')
and (t.product_id is not null)
and (trim(t.product_id) != '')
)
select count(*) from
(
select customer_uuid, dateclosed, storeid, boardname_12, producttype, productsubtype, size, product_id, product_cycle_days from daily_table.product_repeat_sub_query
except
select customer_uuid, dateclosed, storeid, boardname_12, producttype, productsubtype, size, product_id, product_cycle_days from product_repeat_sub_query
);
即使在将分区中的所有字段添加到 order by 并过滤 id 字段中的空值或空白之后,我每次仍然得到不同的计数。
【问题讨论】:
标签: postgresql amazon-redshift amazon-redshift-spectrum