【Posted】: 2022-01-11 04:11:36
【Question】:
I have a Users table that holds the user_id, the place where the user bought a ticket, and the time of the purchase.
Users:
|------------|-------------|----------------------|
| user_id | place | purchase_time |
|------------|-------------|----------------------|
| 1 | New York | 2021-11-27:17:00:21 |
| 1 | Chicago | 2021-11-25:19:00:21 |
| 1 | Chicago | 2021-11-23:03:00:21 |
| 1 | Washington | 2021-11-21:07:00:21 |
| 1 | Washington | 2021-11-19:12:00:21 |
| 1 | Washington | 2021-11-17:00:00:21 |
| 1 | Washington | 2021-11-15:23:00:21 |
| 1 | Washington | 2021-11-12:21:00:21 |
| 2 | Chicago | 2021-09-25:01:00:21 |
| 2 | Milwaukee | 2021-09-24:02:00:21 |
| 2 | Milwaukee | 2021-09-23:03:00:21 |
| 2 | New York | 2021-09-22:19:00:21 |
| 2 | Chicago | 2021-09-21:01:00:21 |
| 3 | Milwaukee | 2021-10-27:12:31:21 |
| 3 | Washington | 2021-10-24:07:01:23 |
| 3 | Chicago | 2021-10-21:01:78:89 |
|------------|-------------|----------------------|
I want to add a new column showing the place the user had visited most often as of each purchase. Desired table (Snowflake):
|------------|-------------|----------------------|---------------------|
| user_id | place | purchase_time | most_visited_place |
|------------|-------------|----------------------|---------------------|
| 1 | New York | 2021-11-27:17:00:21 | Washington | <--- Washington: the user's most-visited place as of this purchase_time
| 1 | Chicago | 2021-11-25:19:00:21 | Washington | <--- Washington: the user's most-visited place as of this purchase_time
| 1 | Chicago | 2021-11-23:03:00:21 | Washington | <--- Washington: the user's most-visited place as of this purchase_time
| 1 | Washington | 2021-11-21:07:00:21 | Washington | <--- Washington: the user's most-visited place as of this purchase_time
| 1 | Washington | 2021-11-19:12:00:21 | Washington | <--- Washington: the user's most-visited place as of this purchase_time
| 1 | Washington | 2021-11-17:00:00:21 | Washington | <--- Washington: the user's most-visited place as of this purchase_time
| 1 | Washington | 2021-11-15:23:00:21 | Washington | <--- Washington: the user's most-visited place as of this purchase_time
| 1 | Washington | 2021-11-12:21:00:21 | Washington | <--- Washington: the user's most-visited place as of this purchase_time
| 2 | Chicago | 2021-09-21:01:00:25 | Chicago | <--- tie: Chicago and Milwaukee are both most visited, so take the one visited most recently
| 2 | Milwaukee | 2021-09-21:02:00:24 | Milwaukee | <--- Milwaukee: the user's most-visited place as of this purchase_time
| 2 | Milwaukee | 2021-09-21:03:00:23 | Milwaukee | <--- Milwaukee: the user's most-visited place as of this purchase_time
| 2 | New York | 2021-09-21:19:00:22 | New York | <--- tie: Chicago and New York are both most visited, so take the one visited most recently
| 2 | Chicago | 2021-09-21:01:00:21 | Chicago | <--- Chicago: the user's most-visited place as of this purchase_time
| 3 | Milwaukee | 2021-10-27:12:31:21 | Milwaukee |
| 3 | Washington | 2021-10-24:07:01:23 | Washington |
| 3 | Chicago | 2021-10-21:01:78:89 | Chicago |
|------------|-------------|----------------------|---------------------|
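【Discussion】:
-
You seem to have included two identical tables in your question. Please update it to show the result you want to achieve.
-
@NickW To the first table (Users), I want to add a new field, most_visited_place, based on the place the user has visited most.
-
You could try the mode window function, which also works on varchar. I don't have access to a Snowflake platform, though, so I haven't tested it.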
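-
For what it's worth, here is a minimal sketch of the "most visited so far, ties go to the most recent visit" logic from the question. It is not tested on Snowflake: the query is plain SQL run through Python's built-in sqlite3 module only so it can be executed locally, and it uses a self-join rather than mode because, as far as I know, mode gives no control over how ties are broken. The helper name most_visited, the CTE names counts and ranked, and the column aliases are made up for illustration. It assumes each user has at most one purchase per purchase_time and that purchase_time sorts chronologically (true for the fixed-width strings in the question).

```python
import sqlite3

# For every purchase (row a), count the same user's purchases up to and
# including that time (rows b) per place, then keep the best candidate:
# highest visit count first, most recent visit as the tie-breaker.
QUERY = """
WITH counts AS (
    SELECT a.user_id,
           a.place,
           a.purchase_time,
           b.place              AS candidate,
           COUNT(*)             AS visits,
           MAX(b.purchase_time) AS last_visit
    FROM users AS a
    JOIN users AS b
      ON b.user_id = a.user_id
     AND b.purchase_time <= a.purchase_time
    GROUP BY a.user_id, a.place, a.purchase_time, b.place
),
ranked AS (
    SELECT counts.*,
           ROW_NUMBER() OVER (
               PARTITION BY user_id, place, purchase_time
               ORDER BY visits DESC, last_visit DESC
           ) AS rn
    FROM counts
)
SELECT user_id, place, purchase_time, candidate AS most_visited_place
FROM ranked
WHERE rn = 1
ORDER BY user_id, purchase_time DESC
"""


def most_visited(rows):
    """Load (user_id, place, purchase_time) rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (user_id INTEGER, place TEXT, purchase_time TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


# User 2 from the Users table in the question (the tie-break case).
rows = [
    (2, "Chicago", "2021-09-25:01:00:21"),
    (2, "Milwaukee", "2021-09-24:02:00:21"),
    (2, "Milwaukee", "2021-09-23:03:00:21"),
    (2, "New York", "2021-09-22:19:00:21"),
    (2, "Chicago", "2021-09-21:01:00:21"),
]

for row in most_visited(rows):
    print(row)
```

For user 2 this yields Chicago, Milwaukee, Milwaukee, New York, Chicago from newest to oldest purchase, which matches the most_visited_place column in the desired table. The self-join is quadratic per user, so on a large table you may want a different formulation; on Snowflake the ranked step could probably be folded into a QUALIFY clause, but I have not verified that there.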
Tags: sql snowflake-cloud-data-platform window-functions