【Question title】: Calculating town distance excluding the locations with time in SQL
【Posted】: 2022-01-13 18:38:45
【Question description】:

I have two tables.

Location: contains the town name, the town's latitude/longitude, and the date the town was established

|--------------|------------|------------|-------------------|
|    Towns     |  latitude  | longitude  |  establish_time   |
|--------------|------------|------------|-------------------|
|  Town_A      |  72.92629  | -12.89272  |    2021-07-21     |
|  Town_B      |  93.62789  | -83.10172  |    2021-08-21     |
|  Town_C      |  68.92612  | -67.17242  |    2021-09-21     |
|--------------|------------|------------|-------------------|

User: contains user_id, the user's latitude/longitude, the town where the movie ticket was purchased, and the purchase time

|------------|------------|------------|--------------|-----------------|
|   user_id  |  latitude  | longitude  |    Towns     |  purchase_time  |
|------------|------------|------------|--------------|-----------------|
|    1       |  21.89027  | -53.03772  |   Town_A     |    2021-08-19   |
|    1       |  23.87847  | -41.78172  |   Town_C     |    2021-09-29   |
|    1       |  77.87092  | -96.39242  |   Town_A     |    2021-11-07   |
|    2       |  83.37847  | -62.78278  |   Town_B     |    2021-10-21   |
|    3       |  89.81924  | -80.73892  |   Town_B     |    2021-12-07   |
|    3       |  27.87282  | -18.39183  |   Town_A     |    2021-08-23   |
|------------|------------|------------|--------------|-----------------|

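For each ticket purchase, I am trying to find the town closest to the user's location, considering only towns that existed at the time of purchase.

Here is my SQL (Snowflake):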

SELECT 
        u.*,
        seq4() as seq,
        l.town as nearest_town,
        haversine(u.latitude, u.longitude, l.latitude, l.longitude)
    FROM User AS u, location AS l
    QUALIFY row_number() OVER (PARTITION BY u.seq ORDER BY haversine(u.latitude, u.longitude, l.latitude, l.longitude)) = 1

The result I get is based only on the user's location, as follows:

|------------|------------|------------|--------------|-----------------|
|   user_id  |  latitude  | longitude  |    Towns     | nearest_town    |
|------------|------------|------------|--------------|-----------------|
|    1       |  21.89027  | -53.03772  |   Town_C     |   Town_C        |
|    1       |  23.87847  | -41.78172  |   Town_C     |   Town_A        | 
|    1       |  77.87092  | -96.39242  |   Town_A     |   Town_B        | <--- This should be Town_A because Town_B was not present at the time of purchase
|    2       |  83.37847  | -62.78278  |   Town_B     |   Town_C        |
|    3       |  89.81924  | -80.73892  |   Town_B     |   Town_A        |
|    3       |  27.87282  | -18.39183  |   Town_A     |   Town_C        | <--- This should be Town_A because Town_B was not present at the time of purchase
|------------|------------|------------|--------------|-----------------|

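The query above returns the nearest town even when that town did not yet exist at the time of purchase.

How can I exclude towns that had not yet been established when computing the nearest town?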

【Question discussion】:

    Tags: sql snowflake-cloud-data-platform window-functions


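    【Solution 1】:

    Switch your implicit CROSS JOIN (implied by the comma-separated list of tables in your FROM clause) to an INNER JOIN whose ON condition is the date restriction you want to add.

    Something like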

    FROM Users AS u
        INNER JOIN location AS l
            ON l.establish_time <= u.purchase_time
    

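    In plain English, this says: "when joining these two tables, only consider locations that were established on or before the date this ticket was purchased."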
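
    For context, here is a rough sketch of how that join might slot into the full query. It is not a tested solution. The table names (`Users`, `location`) and column names (`Towns`, `establish_time`, `purchase_time`) are taken from the sample tables and the snippet above. Partitioning by `(user_id, purchase_time)` is an assumption that this pair uniquely identifies a purchase, and the `distance_km` alias is introduced only for illustration.

    ```sql
    -- Sketch only: names follow the sample tables in the question and are assumptions.
    SELECT
        u.*,
        l.Towns AS nearest_town,
        -- Snowflake's HAVERSINE returns the great-circle distance in kilometers
        HAVERSINE(u.latitude, u.longitude, l.latitude, l.longitude) AS distance_km
    FROM Users AS u
    INNER JOIN location AS l
        -- only towns already established at purchase time are candidates
        ON l.establish_time <= u.purchase_time
    -- assumes (user_id, purchase_time) uniquely identifies one purchase row
    QUALIFY ROW_NUMBER() OVER (
        PARTITION BY u.user_id, u.purchase_time
        ORDER BY HAVERSINE(u.latitude, u.longitude, l.latitude, l.longitude)
    ) = 1;
    ```

    Note that with an INNER JOIN, a purchase made before any town was established has no matching location row and drops out of the result. If those purchases should be kept with a NULL `nearest_town`, a LEFT JOIN with the same ON condition would be one option.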

    【Discussion】:
