【Question title】: Count first-time subscribers by week
【Posted】: 2019-04-09 03:12:11
【Question】:

I have a table, subscriptions, in PostgreSQL 10.5:

id  user_id  starts_at  ends_at
--------------------------------
1   233      02/04/19   03/03/19
2   233      03/04/19   04/03/19
3   296      02/09/19   03/08/19
4   126      02/01/19   02/28/19
5   126      03/01/19   03/31/19
6   922      02/22/19   03/22/19

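I want to count how many new subscribers there are each week. A new subscriber is any user_id that has no subscription entry before that week.

Edit: I slightly modified @fubar's solution to fit my preferred date format. One clarification I forgot to add here is that I would like to see the weeks with 0 as well. How can I integrate generate_series into the query below so that weeks with 0 subscribers also show up?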

SELECT TO_CHAR(date_trunc('week', s.starts_at), 'YYYY-MM-DD') as week, COUNT(*) AS count
FROM subscriptions s
LEFT JOIN subscriptions s1 ON s.user_id = s1.user_id AND s.starts_at > s1.starts_at
WHERE s1.id IS NULL
GROUP BY week
ORDER BY week desc

【Comments】:

  • DISTINCT is not a function. Skip the redundant parentheses and simply write count(distinct s.id) to make the code clearer!

Tags: sql postgresql


【Solution 1】:

You can find each user's first subscription with the following query:

SELECT s.*
FROM subscriptions s
LEFT JOIN subscriptions s1 ON s.user_id = s1.user_id AND s.starts_at > s1.starts_at
WHERE s1.id IS NULL

You can then count the number of new subscribers per year/week with the following query:

SELECT 
    EXTRACT(ISOYEAR FROM s.starts_at) AS year,  -- ISO year, to match the ISO week number below
    EXTRACT(WEEK FROM s.starts_at) AS week,
    COUNT(*) AS count
FROM subscriptions s
LEFT JOIN subscriptions s1 ON s.user_id = s1.user_id AND s.starts_at > s1.starts_at
WHERE s1.id IS NULL
GROUP BY year, week;
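
One thing to watch when grouping by year and week number: in PostgreSQL, EXTRACT(WEEK ...) returns the ISO 8601 week number, which belongs to the ISO year (ISOYEAR) rather than the calendar year. A quick check, using two dates chosen here only for illustration:

```sql
-- Compare calendar year, ISO year, and ISO week for two sample dates.
SELECT
    d AS day,
    EXTRACT(YEAR    FROM d) AS year,
    EXTRACT(ISOYEAR FROM d) AS isoyear,
    EXTRACT(WEEK    FROM d) AS week
FROM (VALUES (DATE '2018-12-31'), (DATE '2019-02-04')) AS t(d);
-- 2018-12-31 -> year 2018, isoyear 2019, week 1
-- 2019-02-04 -> year 2019, isoyear 2019, week 6
```

Pairing the calendar year with the ISO week would label Monday 2018-12-31 as week 1 of 2018, so it would be grouped with the first week of January 2018.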

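Here is an updated query that combines my answer above with generate_series() and your preferred week date format.

SELECT 
  TO_CHAR(date_trunc('week', w.date), 'YYYY-MM-DD') AS week, 
  COUNT(s.id) AS count
FROM generate_series('2018-12-31', NOW(), INTERVAL '1 WEEK') w(date)
LEFT JOIN subscriptions s
  ON s.starts_at >= w.date
 AND s.starts_at < w.date + INTERVAL '1 WEEK'
 -- Test for "first subscription" inside the join rather than in WHERE,
 -- so a week that only contains renewals still appears with a count of 0.
 AND NOT EXISTS (SELECT 1 FROM subscriptions s1
                 WHERE s1.user_id = s.user_id AND s1.starts_at < s.starts_at)
GROUP BY w.date;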

DB Fiddle: https://www.db-fiddle.com/f/b73AbU3KU6dsfTvXu3mzjz/1
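
For a self-contained way to try the zero-filled weekly count, the sketch below inlines the question's sample rows in a CTE. It resolves each user's first start date before joining to the week series, so weeks with no subscriptions, or with renewals only, still show 0. The names subs, first_subs, first_start and week_start are made up for this example, and the end of the series is fixed (instead of NOW()) only to keep the output reproducible.

```sql
-- Sample rows from the question, inlined so the query runs on its own.
WITH subs (id, user_id, starts_at, ends_at) AS (
  VALUES
    (1, 233, DATE '2019-02-04', DATE '2019-03-03'),
    (2, 233, DATE '2019-03-04', DATE '2019-04-03'),
    (3, 296, DATE '2019-02-09', DATE '2019-03-08'),
    (4, 126, DATE '2019-02-01', DATE '2019-02-28'),
    (5, 126, DATE '2019-03-01', DATE '2019-03-31'),
    (6, 922, DATE '2019-02-22', DATE '2019-03-22')
),
first_subs AS (
  -- One row per user: the date of their earliest subscription.
  SELECT user_id, MIN(starts_at) AS first_start
  FROM subs
  GROUP BY user_id
)
SELECT
  TO_CHAR(w.week_start, 'YYYY-MM-DD') AS week,
  COUNT(f.user_id) AS count            -- counts matched rows only, so empty weeks give 0
FROM generate_series(
       TIMESTAMP '2019-01-28',         -- a Monday
       TIMESTAMP '2019-03-04',
       INTERVAL '1 week'
     ) AS w(week_start)
LEFT JOIN first_subs f
       ON f.first_start >= w.week_start
      AND f.first_start <  w.week_start + INTERVAL '1 week'  -- half-open weekly range
GROUP BY w.week_start
ORDER BY w.week_start;
-- week        count
-- 2019-01-28  1
-- 2019-02-04  2
-- 2019-02-11  0
-- 2019-02-18  1
-- 2019-02-25  0
-- 2019-03-04  0
```

The weeks of 2019-02-25 and 2019-03-04 contain only renewals (ids 5 and 2), which is why they come out as 0 here.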

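【Comments】:

  • Your approach works, thanks! How can I see the weeks that have 0 new subscribers? Do I need to use generate_series?
  • Yes. If you want every week listed regardless of whether it has new subscribers, you need to use generate_series.
  • Do you know how to apply that to your query?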
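【Solution 2】:

I upvoted (+1) fubar's solution. It works on every RDBMS.

I will just offer another approach, a Postgres-specific solution that relies on DISTINCT ON.

Find the date of each user's first subscription: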

select 
    distinct on (s.user_id)

    s.*

from subscriptions s
order by s.user_id, s.starts_at;

Output:

| id  | user_id | starts_at                | ends_at                  |
| --- | ------- | ------------------------ | ------------------------ |
| 4   | 126     | 2019-02-01T00:00:00.000Z | 2019-02-28T00:00:00.000Z |
| 1   | 233     | 2019-01-04T00:00:00.000Z | 2019-03-03T00:00:00.000Z |
| 3   | 296     | 2019-02-09T00:00:00.000Z | 2019-03-08T00:00:00.000Z |
| 6   | 922     | 2019-02-22T00:00:00.000Z | 2019-03-22T00:00:00.000Z |

Schema:

CREATE TABLE subscriptions (
  id INT NOT NULL,
  user_id INT NOT NULL,
  starts_at DATE,
  ends_at DATE,
  PRIMARY KEY(id)
);

INSERT INTO subscriptions VALUES
  (1, 233, '2019-01-04', '2019-03-03'),
  (2, 233, '2019-03-04', '2019-04-04'),
  (3, 296, '2019-02-09', '2019-03-08'),
  (4, 126, '2019-02-01', '2019-02-28'),
  (5, 126, '2019-03-01', '2019-03-31'),
  (6, 922, '2019-02-22', '2019-03-22');

Get the number of new subscribers per week:

Live test: https://www.db-fiddle.com/f/vhzw4KvANA6Mvi59NDTy3H/0

with first_time
as
(
    select 
        distinct on (s.user_id)

        s.*

    from subscriptions s
    order by s.user_id, s.starts_at
)
select gs.wk, count(ft.*) as new_subscribers_for_the_week
from 
    generate_series('2019-02-25'::date, now()::date, interval '1 week') gs(wk)
left join first_time ft 
    on gs.wk >= ft.starts_at and gs.wk <= ft.ends_at

group by gs.wk
order by gs.wk;

Output:

| wk                       | new_subscribers_for_the_week |
| ------------------------ | ---------------------------- |
| 2019-02-25T00:00:00.000Z | 4                            |
| 2019-03-04T00:00:00.000Z | 2                            |
| 2019-03-11T00:00:00.000Z | 1                            |
| 2019-03-18T00:00:00.000Z | 1                            |
| 2019-03-25T00:00:00.000Z | 0                            |
| 2019-04-01T00:00:00.000Z | 0                            |
| 2019-04-08T00:00:00.000Z | 0                            |
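
Note that the join condition gs.wk >= ft.starts_at and gs.wk <= ft.ends_at matches every week in which a user's first subscription is still active, which is why 2019-02-25 shows 4. If the goal is to count each user only in the week their first subscription starts, as the question describes, a variant might look like the sketch below. It reuses this answer's sample rows in a CTE (which shadows any real subscriptions table for the duration of the query), and the series bounds are fixed dates picked for this example.

```sql
-- This answer's sample rows, inlined so the query runs on its own.
WITH subscriptions (id, user_id, starts_at, ends_at) AS (
  VALUES
    (1, 233, DATE '2019-01-04', DATE '2019-03-03'),
    (2, 233, DATE '2019-03-04', DATE '2019-04-04'),
    (3, 296, DATE '2019-02-09', DATE '2019-03-08'),
    (4, 126, DATE '2019-02-01', DATE '2019-02-28'),
    (5, 126, DATE '2019-03-01', DATE '2019-03-31'),
    (6, 922, DATE '2019-02-22', DATE '2019-03-22')
),
first_time AS (
  -- DISTINCT ON keeps the earliest row per user, given the ORDER BY.
  SELECT DISTINCT ON (s.user_id) s.*
  FROM subscriptions s
  ORDER BY s.user_id, s.starts_at
)
SELECT
  gs.wk::date AS wk,
  count(ft.user_id) AS new_subscribers_for_the_week
FROM generate_series(
       TIMESTAMP '2018-12-31',         -- a Monday, the week of the earliest start date
       TIMESTAMP '2019-02-25',
       INTERVAL '1 week'
     ) AS gs(wk)
LEFT JOIN first_time ft
       -- match a user to the Monday-based week in which the first subscription starts
       ON date_trunc('week', ft.starts_at::timestamp) = gs.wk
GROUP BY gs.wk
ORDER BY gs.wk;
-- wk          new_subscribers_for_the_week
-- 2018-12-31  1
-- 2019-01-07  0
-- 2019-01-14  0
-- 2019-01-21  0
-- 2019-01-28  1
-- 2019-02-04  1
-- 2019-02-11  0
-- 2019-02-18  1
-- 2019-02-25  0
```

The series has to start on a Monday for the equality with date_trunc('week', ...) to match, since date_trunc truncates to the Monday of the ISO week.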

【Comments】:
