【Question title】:SQL - Query within month difference from the first month, counting activity
【Posted】:2022-01-13 10:54:18
【Question】:

I have a table like the following:

user_id status month
1 frequent_user 01.04.2020
1 infrequent_user 01.02.2020
2 frequent_user 01.06.2020
3 frequent_user 01.04.2020
3 infrequent_user 01.03.2020
3 frequent_user 01.06.2020
4 frequent_user 01.06.2020
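
To make the sample reproducible, the table can be created along these lines. This is only a sketch: the table name `conversions` is taken from the answer further down, while the column types (and storing `month` as a `DATE` on the first day of the month) are assumptions, since the question does not state them.

```sql
-- Assumed schema; the question does not give column types.
CREATE TABLE conversions (
    user_id INT,
    status  VARCHAR(20),
    month   DATE          -- first day of the activity month
);

-- Sample rows from the question (dates rewritten as YYYY-MM-DD).
INSERT INTO conversions (user_id, status, month) VALUES
    (1, 'frequent_user',   '2020-04-01'),
    (1, 'infrequent_user', '2020-02-01'),
    (2, 'frequent_user',   '2020-06-01'),
    (3, 'frequent_user',   '2020-04-01'),
    (3, 'infrequent_user', '2020-03-01'),
    (3, 'frequent_user',   '2020-06-01'),
    (4, 'frequent_user',   '2020-06-01');
```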

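The question is how many new users converted to frequent_user within 1m, 2m, 3m, at any point in time. A new user means, for example, that user_id 1 has its first activity on 01.02.2020, so it is a new user in that month, and it converted to frequent_user within 2 months. One more point: user_id 3 becomes a frequent_user for the second time on 01.06.2020. That is not what I am interested in, though; I only want the first time.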
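
The two dates that matter per user (first activity and first time as frequent_user) can be sketched with a plain aggregate. This assumes the table is called `conversions` (the name used in the answer) and that `month` is a `DATE` column; neither is stated in the question.

```sql
-- One row per user: when they first appeared and when they first
-- became a frequent_user (NULL if they never did).
SELECT user_id,
       MIN(month) AS first_month,
       MIN(CASE WHEN status = 'frequent_user' THEN month END) AS first_frequent_month
FROM conversions
GROUP BY user_id
ORDER BY user_id;

-- With the sample rows from the question:
-- user_id | first_month | first_frequent_month
--       1 | 2020-02-01  | 2020-04-01
--       2 | 2020-06-01  | 2020-06-01
--       3 | 2020-03-01  | 2020-04-01
--       4 | 2020-06-01  | 2020-06-01
```

Taking the minimum over only the frequent_user rows is what makes user_id 3's second frequent_user month (01.06.2020) drop out.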

So the output should look like this:

month 1m 2m 3m 4m
01.02.2020 0 1 0 0
01.03.2020 1 0 0 0
01.06.2020 2 0 0 0

I don't know how to write the query. Thanks a lot for your effort; any insight is appreciated.

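【Question comments】:

  • Could you tag your RDBMS?
  • What happens with user_id 2 and 4? There is no record of when they converted from infrequent user to frequent user. Would you automatically count these cases as conversions within 1 month?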

Tags: sql count case


【Solution 1】:

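You can work through the problem step by step with several CTEs, as shown below. I have provided implementations for both SQL Server and PostgreSQL on dbfiddle; the query shown here uses SQL Server syntax.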

WITH firstconversions AS (
     -- For every row: the user's first month, the status and month of the user's
     -- next row, and a running count of 'infrequent_user' rows up to this month.
     SELECT *, MIN(month) OVER (PARTITION BY user_id) AS min_month,
               LEAD(status) OVER (PARTITION BY user_id ORDER BY month) AS next_status,
               LEAD(month) OVER (PARTITION BY user_id ORDER BY month) AS next_month,
               COUNT(CASE status WHEN 'infrequent_user' THEN status END) 
               OVER (PARTITION BY user_id ORDER BY month) AS status_count
     FROM conversions
),
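nextconversions AS (
-- Flag as 'Valid' only rows at the user's first month: either an infrequent_user row
-- directly followed by a frequent_user row, or a lone frequent_user row with no later row.
-- DATEDIFF(month, ...) is SQL Server syntax.
SELECT *, DATEDIFF(month, month, next_month) AS duration,
          CASE WHEN status = 'infrequent_user' AND next_status = 'frequent_user' AND month = min_month AND status_count > 0 THEN 'Valid'
               WHEN status = 'frequent_user' AND next_status IS NULL AND month = min_month AND status_count = 0 THEN 'Valid'
               ELSE 'Invalid' END AS group_logic
FROM firstconversions
),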
groupconversions AS (
SELECT *,
       CASE WHEN group_logic = 'Valid' AND duration IS NOT NULL THEN CONCAT(duration, 'm')
            WHEN group_logic = 'Valid' AND duration IS NULL THEN CONCAT(1, 'm')
            END AS duration_flag
FROM nextconversions
),
grouped AS (
SELECT duration_flag, month
FROM groupconversions
WHERE duration_flag IS NOT NULL
)
SELECT month, 
       COUNT(CASE WHEN duration_flag = '1m' THEN duration_flag END) AS [1m],
       COUNT(CASE WHEN duration_flag = '2m' THEN duration_flag END) AS [2m],
       COUNT(CASE WHEN duration_flag = '3m' THEN duration_flag END) AS [3m],
       COUNT(CASE WHEN duration_flag = '4m' THEN duration_flag END) AS [4m]
FROM grouped
GROUP BY month
ORDER BY month;

SQL Server solution: Demo

PostgreSQL solution: Demo

Result

month | 1m | 2m | 3m | 4m
:--------- | -: | -: | -: | -:
2020-02-01 | 0 | 1 | 0 | 0
2020-03-01 | 1 | 0 | 0 | 0
2020-06-01 | 2 | 0 | 0 | 0
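
`DATEDIFF` and bracketed aliases such as `[1m]` are SQL Server syntax, so the query above does not run unchanged on PostgreSQL. The following is a shorter PostgreSQL sketch of the same report, not the contents of the linked demo. It uses a different technique (a per-user aggregate instead of `LEAD`) and assumes `month` is a `DATE` column in a table named `conversions`.

```sql
WITH firsts AS (
    -- First activity month and first frequent_user month per user.
    SELECT user_id,
           MIN(month) AS first_month,
           MIN(CASE WHEN status = 'frequent_user' THEN month END) AS first_frequent_month
    FROM conversions
    GROUP BY user_id
),
durations AS (
    SELECT first_month,
           -- Calendar-month difference; users who are frequent in their
           -- very first month are counted as 1m, as in the answer above.
           GREATEST(1,
               (EXTRACT(YEAR  FROM first_frequent_month) - EXTRACT(YEAR  FROM first_month)) * 12
             + (EXTRACT(MONTH FROM first_frequent_month) - EXTRACT(MONTH FROM first_month))
           ) AS months_to_convert
    FROM firsts
    WHERE first_frequent_month IS NOT NULL   -- skip users who never converted
)
SELECT first_month AS month,
       COUNT(*) FILTER (WHERE months_to_convert = 1) AS "1m",
       COUNT(*) FILTER (WHERE months_to_convert = 2) AS "2m",
       COUNT(*) FILTER (WHERE months_to_convert = 3) AS "3m",
       COUNT(*) FILTER (WHERE months_to_convert = 4) AS "4m"
FROM durations
GROUP BY first_month
ORDER BY first_month;
```

On the sample rows this gives the same three lines as the result above. The behavior differs from the CTE version in one case: a user with several infrequent_user months before the first frequent_user month is counted here, whereas the `LEAD`-based logic only accepts a frequent_user row that immediately follows the first month.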

【Comments】:
