【Posted】:2022-01-15 00:34:27
【Question】:
I've run into a problem that I think timescaledb could help with.
Suppose I have this table:
CREATE TABLE purchase
(
id integer NOT NULL DEFAULT nextval('purchase_id_seq'::regclass),
"timestamp" timestamp without time zone NOT NULL,
country character varying(128) COLLATE pg_catalog."default",
product character varying(128) COLLATE pg_catalog."default",
quantity numeric(64,32),
price numeric(64,32)
)
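Each row represents a purchase:
- id
- timestamp of the purchase
- country where it took place
- product being purchased
- quantity purchased
- price paid per unit
For each purchase, I want to compute (pseudocode):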
price - last price for given (country, product) where timestamp - timestamp of old record > 5 minutes
For example, if I have these purchases:
id  timestamp               country  product  quantity  price
1   2021-12-09 07:12:11.13  US       apple    1         1.2
2   2021-12-09 07:13:11.13  US       apple    2         1.3
3   2021-12-09 07:17:12.13  US       apple    2         1.4
4   2021-12-09 07:20:19.13  US       apple    2         0.9
then I would get these prior prices, from which the deltas are computed:
id  timestamp               country  product  quantity  price  last_price_at_least_five_minutes_ago
1   2021-12-09 07:12:11.13  US       apple    1         1.2    NULL
2   2021-12-09 07:13:11.13  US       apple    2         1.3    NULL
3   2021-12-09 07:17:12.13  US       apple    2         1.4    1.2
4   2021-12-09 07:20:19.13  US       apple    2         0.9    1.3
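For each CURRENTROW, what is the simplest way to ask for "the price of the row with the highest timestamp that is less than CURRENTROW's timestamp - '5 minutes'"?
I naively tried this: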
SELECT
t1.country,
t1.product,
t1.timestamp,
t1.id,
t1.price,
t2.id AS last_id,
t2.timestamp AS last_timestamp,
t2.price AS last_price
FROM
purchase t1
LEFT JOIN purchase t2
ON
t2.timestamp < t1.timestamp - INTERVAL '5m' AND
t1.country = t2.country AND
t1.product = t2.product
GROUP BY
t1.country,
t1.product,
t1.id,
t1.price,
t1.timestamp,
t2.id,
t2.price,
t2.timestamp
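But this hangs, and I'm sure it does a lot of unnecessary work, because the joined result set is huge (each row * N, where N is the number of rows more than 5 minutes older).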
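One possible way to avoid that row explosion, sketched here and not tested against the real data, is to fetch only the single most recent qualifying row per purchase with a lateral subquery instead of joining against all of them. This assumes plain PostgreSQL 9.5 or later (LATERAL and CREATE INDEX IF NOT EXISTS); it uses no timescaledb-specific feature, and the index name is made up for illustration.

```sql
-- Hypothetical index name. It lets each lookup below walk backwards
-- in time within one (country, product) group and stop at the first hit.
CREATE INDEX IF NOT EXISTS purchase_country_product_ts_idx
    ON purchase (country, product, "timestamp" DESC);

SELECT
    t1.id,
    t1."timestamp",
    t1.country,
    t1.product,
    t1.quantity,
    t1.price,
    prev.price            AS last_price_at_least_five_minutes_ago,
    t1.price - prev.price AS delta  -- NULL when no earlier row qualifies
FROM purchase t1
LEFT JOIN LATERAL (
    -- Latest purchase of the same product in the same country
    -- that is more than 5 minutes older than the current row.
    SELECT t2.price
    FROM purchase t2
    WHERE t2.country = t1.country
      AND t2.product = t1.product
      AND t2."timestamp" < t1."timestamp" - INTERVAL '5 minutes'
    ORDER BY t2."timestamp" DESC
    LIMIT 1
) prev ON true
ORDER BY t1.id;
```

Because of the LIMIT 1, each purchase row yields at most one joined row, so the GROUP BY from the attempt above is no longer needed.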
【Discussion】: