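[Posted]: 2020-11-14 00:37:49
[Question]:
I have time series data for each user in which values are missing at irregular intervals. I want to resample it to a regular interval, filling the gaps by linear interpolation, using BigQuery Standard SQL.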
+------+---------------------+-------+
| name | time                | value |
+------+---------------------+-------+
| Jane | 2020-11-14 09:01:00 | 3     |
| Jane | 2020-11-14 09:05:00 | 5     |
| Jane | 2020-11-14 09:07:00 | 1     |
| Jane | 2020-11-14 09:09:00 | 8     |
| Jane | 2020-11-14 09:10:00 | 4     |
| Kay  | 2020-11-14 09:01:00 | 7     |
| Kay  | 2020-11-14 09:04:00 | 1     |
| Kay  | 2020-11-14 09:05:00 | 10    |
| Kay  | 2020-11-14 09:09:00 | 6     |
| Kay  | 2020-11-14 09:10:00 | 7     |
+------+---------------------+-------+
I would like to transform it into the following, with one row per minute for each user:
+------+---------------------+-------+-----------------+
| name | time                | value | note            |
+------+---------------------+-------+-----------------+
| Jane | 2020-11-14 09:01:00 | 3     |                 |
| Jane | 2020-11-14 09:02:00 | 3.5   | <= interpolated |
| Jane | 2020-11-14 09:03:00 | 4     | <= interpolated |
| Jane | 2020-11-14 09:04:00 | 4.5   | <= interpolated |
| Jane | 2020-11-14 09:05:00 | 5     |                 |
| Jane | 2020-11-14 09:06:00 | 3     | <= interpolated |
| Jane | 2020-11-14 09:07:00 | 1     |                 |
| Jane | 2020-11-14 09:08:00 | 4.5   | <= interpolated |
| Jane | 2020-11-14 09:09:00 | 8     |                 |
| Jane | 2020-11-14 09:10:00 | 4     |                 |
| Kay  | 2020-11-14 09:01:00 | 7     |                 |
| Kay  | 2020-11-14 09:02:00 | 5     | <= interpolated |
| Kay  | 2020-11-14 09:03:00 | 3     | <= interpolated |
| Kay  | 2020-11-14 09:04:00 | 1     |                 |
| Kay  | 2020-11-14 09:05:00 | 10    |                 |
| Kay  | 2020-11-14 09:06:00 | 9     | <= interpolated |
| Kay  | 2020-11-14 09:07:00 | 8     | <= interpolated |
| Kay  | 2020-11-14 09:08:00 | 7     | <= interpolated |
| Kay  | 2020-11-14 09:09:00 | 6     |                 |
| Kay  | 2020-11-14 09:10:00 | 7     |                 |
+------+---------------------+-------+-----------------+
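To make the intended arithmetic explicit, here is a minimal Python sketch of the transformation. It is not BigQuery SQL and is not meant as the answer; it only shows the rule assumed here: for each user, every missing minute between two consecutive observations gets a value on the straight line between them. The names `interpolate_per_minute` and `at` are hypothetical helpers introduced for this illustration.

```python
from datetime import datetime, timedelta


def interpolate_per_minute(rows, step=timedelta(minutes=1)):
    """Fill gaps between consecutive observations of each user.

    rows: iterable of (name, time, value).
    Returns a list of (name, time, value, interpolated) sorted by name, time.
    """
    # Group observations by user so each series is interpolated separately.
    by_name = {}
    for name, time, value in rows:
        by_name.setdefault(name, []).append((time, value))

    out = []
    for name in sorted(by_name):
        points = sorted(by_name[name])
        # Walk over each pair of neighbouring observations.
        for (t0, v0), (t1, v1) in zip(points, points[1:]):
            out.append((name, t0, float(v0), False))
            t = t0 + step
            while t < t1:
                # Fraction of the way from t0 to t1 (timedelta / timedelta is a float).
                frac = (t - t0) / (t1 - t0)
                out.append((name, t, v0 + (v1 - v0) * frac, True))
                t += step
        # The last observation of each user has no successor; emit it as-is.
        last_t, last_v = points[-1]
        out.append((name, last_t, float(last_v), False))
    return out


def at(hhmm):
    """Hypothetical helper: build a timestamp on 2020-11-14 from 'HH:MM'."""
    return datetime.strptime("2020-11-14 " + hhmm, "%Y-%m-%d %H:%M")


rows = [
    ("Jane", at("09:01"), 3), ("Jane", at("09:05"), 5), ("Jane", at("09:07"), 1),
    ("Jane", at("09:09"), 8), ("Jane", at("09:10"), 4),
    ("Kay", at("09:01"), 7), ("Kay", at("09:04"), 1), ("Kay", at("09:05"), 10),
    ("Kay", at("09:09"), 6), ("Kay", at("09:10"), 7),
]

for name, time, value, interpolated in interpolate_per_minute(rows):
    print(name, time, round(value, 2), "<= interpolated" if interpolated else "")
```

For example, Jane's value at 09:02 is one quarter of the way from 3 (09:01) to 5 (09:05), which gives 3.5. A SQL solution would need the same two ingredients: the previous and next observation per user, and a generated row for every minute in between.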
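Is there a smart way to do this?
Addendum: this is an applied variant of this stackoverflow question. It is very similar, but the difference is that my data is time series data and each row carries a user name, so the interpolation has to be done per user.
Thank you.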
[Comments]:
Tags: sql google-bigquery interpolation missing-data