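[Posted]: 2020-01-19 18:46:59
[Question]:
I have a time-series table in which measurements are recorded as "wide" rows. A row may contain all of the measurements or only some of them; the remaining columns are then set to NULL.
I would like to use time_bucket_gapfill() to "clean up" this table so that every row of the output has data in every column, even where the underlying data set has NULLs in some columns.
This is how I set up the table with some data (the schema is from the getting started guide):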
CREATE TABLE conditions (
  time        TIMESTAMPTZ       NOT NULL,
  location    TEXT              NOT NULL,
  temperature DOUBLE PRECISION  NULL,
  humidity    DOUBLE PRECISION  NULL
);
SELECT create_hypertable('conditions', 'time');
INSERT INTO conditions(time, location, temperature, humidity)
VALUES ('2019-07-10 05:02:14-07', 'office', 70.0, 50.0);
INSERT INTO conditions(time, location, temperature, humidity)
VALUES ('2019-07-10 05:02:15-07', 'office', 71.0, null);
INSERT INTO conditions(time, location, temperature, humidity)
VALUES ('2019-07-10 05:02:16-07', 'office', 72.0, 48.0);
-- gap at 2019-07-10 05:02:17-07
INSERT INTO conditions(time, location, temperature, humidity)
VALUES ('2019-07-10 05:02:18-07', 'office', 72.0, 48.0);
INSERT INTO conditions(time, location, temperature, humidity)
VALUES ('2019-07-10 05:02:18.8-07', 'office', 72.1, NULL);
INSERT INTO conditions(time, location, temperature, humidity)
VALUES ('2019-07-10 05:02:19.2-07', 'office', NULL, 46.0);
INSERT INTO conditions(time, location, temperature, humidity)
VALUES ('2019-07-10 05:02:20-07', 'office', 73.0, 45.0);
This is how I query it:
SELECT
  time_bucket_gapfill('1000ms', time,
    start => '2019-07-10 05:02:13',
    finish => '2019-07-10 05:02:21'
  ) as ival,
  count(*) as samplesUsed,
  interpolate(avg(temperature)) as lineartemperature,
  interpolate(avg(humidity)) as linearhumidity
FROM conditions
GROUP BY ival
ORDER BY ival;
The output is:
          ival          | samplesused | lineartemperature | linearhumidity
------------------------+-------------+-------------------+----------------
 2019-07-10 05:02:13-07 |             |                   |
 2019-07-10 05:02:14-07 |           1 |                70 |             50
 2019-07-10 05:02:15-07 |           1 |                71 |
 2019-07-10 05:02:16-07 |           1 |                72 |             48
 2019-07-10 05:02:17-07 |             |            72.025 |             48
 2019-07-10 05:02:18-07 |           2 |             72.05 |             48
 2019-07-10 05:02:19-07 |           1 |                   |             46
 2019-07-10 05:02:20-07 |           1 |                73 |             45
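- I understand why the first row is empty: the data set has no data before that bucket to interpolate from.
- At 05:02:17, where the data set has no row at all, the interpolation works fine.
- At 05:02:15 and 05:02:19, however, the underlying rows are "partial", and the database does not interpolate the missing humidity and temperature, respectively, from the preceding and following rows.
How can I write the query so that it returns interpolated values for all measurement columns?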
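As a sanity check on the gap-filled row, the 05:02:17 temperature can be reproduced by hand. This is a minimal sketch in plain PostgreSQL, assuming interpolate() draws a straight line between the averages of the two neighboring buckets (72.0 at 05:02:16, and the mean of 72.0 and 72.1 at 05:02:18). The CTE and column names are made up for illustration and are not part of the schema above.

```sql
-- Hypothetical names (bucket_avgs, temp_16, temp_18, temp_17), for illustration only.
WITH bucket_avgs AS (
  SELECT
    72.0               AS temp_16,  -- avg(temperature) of the 05:02:16 bucket
    (72.0 + 72.1) / 2  AS temp_18   -- avg(temperature) of the 05:02:18 bucket (two rows)
)
SELECT
  -- Linear interpolation: the 05:02:17 bucket lies 1 of 2 seconds
  -- along the way from the 05:02:16 bucket to the 05:02:18 bucket.
  temp_16 + (temp_18 - temp_16) * (17 - 16) / (18 - 16) AS temp_17
FROM bucket_avgs;
```

The result is numerically equal to 72.025, which matches the lineartemperature value reported for 05:02:17 in the output.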
[Discussion]:
Tags: sql postgresql time-series timescaledb