【Posted】: 2022-01-13 18:38:45
【Question】:
I have two tables.
Location: holds the town name, the town's latitude/longitude, and the date the town opened
|--------------|------------|------------|-------------------|
| Towns | latitude | longitude | establish_time |
|--------------|------------|------------|-------------------|
| Town_A | 72.92629 | -12.89272 | 2021-07-21 |
| Town_B | 93.62789 | -83.10172 | 2021-08-21 |
| Town_C | 68.92612 | -67.17242 | 2021-09-21 |
|--------------|------------|------------|-------------------|
User: holds user_id, the user's latitude/longitude, the town where the movie ticket was purchased, and the purchase time
|------------|------------|------------|--------------|-----------------|
| user_id | latitude | longitude | Towns | purchase_time |
|------------|------------|------------|--------------|-----------------|
| 1 | 21.89027 | -53.03772 | Town_A | 2021-08-19 |
| 1 | 23.87847 | -41.78172 | Town_C | 2021-09-29 |
| 1 | 77.87092 | -96.39242 | Town_A | 2021-11-07 |
| 2 | 83.37847 | -62.78278 | Town_B | 2021-10-21 |
| 3 | 89.81924 | -80.73892 | Town_B | 2021-12-07 |
| 3 | 27.87282 | -18.39183 | Town_A | 2021-08-23 |
|------------|------------|------------|--------------|-----------------|
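For each ticket purchase, I am trying to find the town nearest to the user's location, that is, the shortest distance from the user's location to any of the towns at the time of purchase.
Here is my SQL (Snowflake):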
SELECT
    u.*,
    seq4() AS seq,
    l.town AS nearest_town,
    haversine(u.latitude, u.longitude, l.latitude, l.longitude)
FROM User AS u, location AS l
QUALIFY row_number() OVER (PARTITION BY u.seq ORDER BY haversine(u.latitude, u.longitude, l.latitude, l.longitude)) = 1
The result I get is based on the user's location alone, as shown below:
|------------|------------|------------|--------------|-----------------|
| user_id | latitude | longitude | Towns | nearest_town |
|------------|------------|------------|--------------|-----------------|
| 1 | 21.89027 | -53.03772 | Town_C | Town_C |
| 1 | 23.87847 | -41.78172 | Town_C | Town_A |
| 1 | 77.87092 | -96.39242 | Town_A | Town_B | <--- This should be Town_A because Town_B was not present at the time of purchase
| 2 | 83.37847 | -62.78278 | Town_B | Town_C |
| 3 | 89.81924 | -80.73892 | Town_B | Town_A |
| 3 | 27.87282 | -18.39183 | Town_A | Town_C | <--- This should be Town_A because Town_B was not present at the time of purchase
|------------|------------|------------|--------------|-----------------|
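The query above returns the nearest town even when that town did not yet exist at the time of purchase.
How can I exclude towns that had not yet opened when computing the nearest town?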
【Discussion】:
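One possible approach, offered as a sketch rather than a tested answer: restrict the join so that each purchase is only paired with towns whose establish_time is on or before the purchase_time, then keep the closest remaining town. The table and column names below follow the sample data in the question (the town column is written as Towns, as in the tables, although the question's SQL refers to l.town). The row_id column and the numbered_users CTE are hypothetical helpers introduced here so that each purchase row has a stable partition key; they are not part of the original schema.

```sql
-- Sketch only: assumes the tables are named User and location, as in the question.
WITH numbered_users AS (
    -- Give every purchase row a unique id to partition on.
    -- row_id is a hypothetical helper column, not in the original tables.
    SELECT
        u.*,
        ROW_NUMBER() OVER (ORDER BY u.user_id, u.purchase_time) AS row_id
    FROM User AS u
)
SELECT
    nu.*,
    l.Towns AS nearest_town,
    -- Snowflake's HAVERSINE takes degrees and returns kilometers.
    HAVERSINE(nu.latitude, nu.longitude, l.latitude, l.longitude) AS distance_km
FROM numbered_users AS nu
JOIN location AS l
    -- Only consider towns that already existed when the ticket was bought.
    ON l.establish_time <= nu.purchase_time
-- Keep the single closest eligible town for each purchase row.
QUALIFY ROW_NUMBER() OVER (
    PARTITION BY nu.row_id
    ORDER BY HAVERSINE(nu.latitude, nu.longitude, l.latitude, l.longitude)
) = 1;
```

Because this is an inner join, a purchase made before any town was established would drop out of the result; a LEFT JOIN would keep such rows with a NULL nearest_town if that is preferred. Numbering the rows in a CTE before the join also avoids calling seq4() in the same SELECT as the cross join, where it would be evaluated per joined row rather than per purchase.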
Tags: sql snowflake-cloud-data-platform window-functions