【Question title】: Get an average monthly view of active members (PostgreSQL)
【Posted】: 2020-07-27 13:17:18
【Question】:

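I am working with member data. I have the responsible coach plus each coachee's entry status, exit status, and the corresponding dates. Because some coachees may graduate or leave partway through a month, I want to compute a daily headcount and then get the monthly average of active members per coach. That means I also need to account for coachees who entered in earlier months and are still active in the current month. Here is my data: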

I was thinking of first creating a variable that gives the number of active members per coach per day. This is my first approach:

with all_years as (
    select y.year, m.month, d.day
    from generate_series(2019, 2022) as y(year)
             cross join generate_series(1, 12) as m(month)
             cross join generate_series(1, 31) as d(day) -- not sure how to adjust for months with fewer than 31 days
)
select ay.*, coach, coachee, entry_status, entry_date, exit_reason, exit_date,
       sum(count) over (partition by ay.coach order by ay.year, ay.month, ay.day)
from all_years ay
    left join table t
    on -- .... not sure what I can join on in this case

I am open to simpler approaches; this logic is just an idea.

【Question comments】:

    Tags: postgresql date sum pivot average


    【Solution 1】:

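    You can cross join the list of distinct coaches with every date in the range to generate all coach/day combinations, then bring in the table with a left join: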

    select d.dt, c.coach, count(t.coach) no_coachees
    from (select distinct coach from mytable) c
    cross join generate_series('2019-01-01'::date, '2022-12-31'::date, '1 day'::interval) d(dt)
    left join mytable t on t.coach = c.coach and t.entry_date <= d.dt and t.exit_date > d.dt
    group by d.dt, c.coach
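
To see how the daily count behaves, here is a minimal, self-contained sketch of the same join run against a few made-up rows. The sample data in the CTE is hypothetical (the original post does not show its data), and only the columns the join needs are included.

```sql
-- Hypothetical sample data; the real table and its columns may differ.
with mytable(coach, coachee, entry_date, exit_date) as (
    values
        ('A', 'c1', date '2019-01-01', date '2019-01-03'),
        ('A', 'c2', date '2019-01-02', date '2019-01-05'),
        ('B', 'c3', date '2019-01-04', date '2019-01-06')
)
select d.dt::date as dt, c.coach, count(t.coach) as no_coachees
from (select distinct coach from mytable) c
-- one row per calendar day, so month lengths need no special handling
cross join generate_series(date '2019-01-01', date '2019-01-05', interval '1 day') d(dt)
left join mytable t
    on t.coach = c.coach
   and t.entry_date <= d.dt   -- already entered on this day
   and t.exit_date > d.dt     -- not yet exited (the exit day itself is not counted)
group by d.dt, c.coach
order by c.coach, d.dt;
```

With this sample, coach A gets daily counts of 1, 2, 1, 1, 0 and coach B gets 0, 0, 0, 1, 1. Because count(t.coach) ignores the NULLs produced by the left join, days without any active coachee still appear with a count of 0.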
    

    You can then add another level of aggregation to get the monthly average:

    select date_trunc('month', t.dt) d_month, coach, avg(no_coachees) avg_coaches
    from (
        -- daily number of active coachees per coach
        select d.dt, c.coach, count(t.coach) no_coachees
        from (select distinct coach from mytable) c
        cross join generate_series('2019-01-01'::date, '2022-12-31'::date, '1 day'::interval) d(dt)
        left join mytable t on t.coach = c.coach and t.entry_date <= d.dt and t.exit_date > d.dt
        group by d.dt, c.coach
    ) t
    -- the outer query must reference the subquery alias t, not the inner alias d
    group by date_trunc('month', t.dt), coach
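
The date series in the query ends at a fixed date, so a month that is still in progress is averaged over days that have not happened yet. The sketch below is one possible variant, not part of the original answer. It rests on two assumptions that the post does not confirm: that future days should be left out of the average, and that a NULL exit_date means the coachee is still active. The sample rows in the CTE are hypothetical and stand in for the real table.

```sql
-- Hypothetical sample data; replace this CTE with the real table.
with mytable(coach, coachee, entry_date, exit_date) as (
    values
        ('A', 'c1', date '2019-01-01', date '2019-02-10'),
        ('A', 'c2', date '2019-01-15', null::date)   -- assumed: NULL = still active
)
select date_trunc('month', x.dt) as d_month, x.coach, avg(x.no_coachees) as avg_coachees
from (
    select d.dt, c.coach, count(t.coach) as no_coachees
    from (select distinct coach from mytable) c
    -- assumption: stop the series at today so future days are not averaged in
    cross join generate_series(date '2019-01-01', current_date, interval '1 day') d(dt)
    left join mytable t
        on t.coach = c.coach
       and t.entry_date <= d.dt
       -- assumption: a missing exit date counts as "not exited yet"
       and (t.exit_date is null or t.exit_date > d.dt)
    group by d.dt, c.coach
) x
group by date_trunc('month', x.dt), x.coach
order by d_month, x.coach;
```

Whether either adjustment is appropriate depends on how the data records coachees who have not left yet, which the post does not say.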
    

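    【Comments】:

    • It seems to work, except that the current month looks too low. Does the current month need any adjustment?
    • @Aww: There is nothing special about the current month in the query, so its results should be consistent with those of the other months.
    • Some coachees have only one status, which shows up in the data as identical entry and exit statuses and identical entry and exit dates. Does this part, left join mytable t on t.coach = c.coach and t.entry_date <= d.dt and t.exit_date > d.dt, exclude them from the count?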
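
On the last comment: with a strict t.exit_date > d.dt, a row whose entry_date equals its exit_date can never satisfy both conditions on any day, so it is not counted. The sketch below compares the strict comparison with an inclusive one (>=) on hypothetical sample rows. The inclusive variant is only an illustration, not something the original answer proposes, and it also counts every other coachee on their exit day.

```sql
-- Hypothetical sample data: c2 enters and exits on the same day.
with mytable(coach, coachee, entry_date, exit_date) as (
    values
        ('A', 'c1', date '2019-01-01', date '2019-01-03'),
        ('A', 'c2', date '2019-01-02', date '2019-01-02')
)
select d.dt::date as dt,
       -- condition used in the answer: the exit day is not counted
       count(*) filter (where t.exit_date > d.dt)  as exclusive_exit,
       -- illustrative alternative: the exit day is counted
       count(*) filter (where t.exit_date >= d.dt) as inclusive_exit
from generate_series(date '2019-01-01', date '2019-01-03', interval '1 day') d(dt)
left join mytable t
    on t.entry_date <= d.dt
group by d.dt
order by d.dt;
```

For the three days, exclusive_exit is 1, 1, 0 and inclusive_exit is 1, 2, 1. The same-day coachee c2 appears only in the inclusive column, on 2019-01-02.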