【Question title】: MySQL cumulative sum grouped by name and date
【Posted】: 2022-01-07 08:48:10
【Question】:

I have a task where I need to compute cumulative counts over two different date columns, which I did with the following query:

    -- TABLE is a placeholder for the real table name
    with cte as (
        select
            p.starteddate as CD,
            (select count(*) from TABLE t1 where p.starteddate = t1.starteddate) as Started,
            (select count(*) from TABLE t1 where p.starteddate = t1.updateddate) as Updated
        from TABLE p
        group by CD, Started, Updated
    )
    select
        CD,
        sum(Started) over (order by CD asc rows between unbounded preceding and current row) as Started,
        sum(Updated) over (order by CD asc rows between unbounded preceding and current row) as Updated
    from cte
    order by CD desc;

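Now I have to add more conditions to this query, and I am stuck.

  1. The query should take the name into account and accumulate the dates only per name. Since there are 11 or so names, I guess the best option is to retrieve the names first with a separate DISTINCT query, but I don't know how to go further from there.
  2. Only type A should be considered.
  3. In addition, since there are 1000 rows and counting, and the dates go back a year, only the last 60 days should be considered.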

Thanks!

Example Table

Type  Startdate  updateddate  name
A     12/01/21   12/01/21     D
A     11/01/21   12/01/21     D
A     13/01/21   13/01/21     E
A     .
A     07/01/21   11/01/21     E
A     12/01/21   14/01/21     E
A     .
A     14/01/21   14/01/21     G
A     12/01/21   12/01/21     D
A     11/01/21   12/01/21     D
A     13/01/21   13/01/21     E
A     .
A     07/01/21   11/01/21     E
A     12/01/21   14/01/21     E
A     14/01/21   null         G
A     11/01/21   11/01/21     F
A     14/01/21   15/01/21     G
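
For anyone who wants to reproduce the problem, the sample rows above can be loaded with a short setup script. This is only a sketch: the table name my_table and the column names follow the query in the answer, the column types are assumptions, and the dates are read as DD/MM/YY (which matches the 2021-01-xx dates in the answer's output). The elided "A ." rows are left out because their values are not given.

```sql
-- Hypothetical schema: the column types are assumptions, not given in the question.
CREATE TABLE my_table (
    Type        CHAR(1)     NOT NULL,
    startdate   DATE        NOT NULL,
    updateddate DATE        NULL,      -- one sample row has no update date
    name        VARCHAR(10) NOT NULL
);

-- The 14 fully specified sample rows, in the order they appear in the question.
INSERT INTO my_table (Type, startdate, updateddate, name) VALUES
    ('A', '2021-01-12', '2021-01-12', 'D'),
    ('A', '2021-01-11', '2021-01-12', 'D'),
    ('A', '2021-01-13', '2021-01-13', 'E'),
    ('A', '2021-01-07', '2021-01-11', 'E'),
    ('A', '2021-01-12', '2021-01-14', 'E'),
    ('A', '2021-01-14', '2021-01-14', 'G'),
    ('A', '2021-01-12', '2021-01-12', 'D'),
    ('A', '2021-01-11', '2021-01-12', 'D'),
    ('A', '2021-01-13', '2021-01-13', 'E'),
    ('A', '2021-01-07', '2021-01-11', 'E'),
    ('A', '2021-01-12', '2021-01-14', 'E'),
    ('A', '2021-01-14', NULL,         'G'),
    ('A', '2021-01-11', '2021-01-11', 'F'),
    ('A', '2021-01-14', '2021-01-15', 'G');
```

Because the sample dates are from January 2021, any "last 60 days" filter has to stay disabled (or the dates shifted) when testing against these rows.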

Expected Outcome

Name  Date      Count Start  Count Updated
E     07/01/21  2            0
E     11/01/21  2            2
E     12/01/21  4            2
E     13/01/21  6            4
D     11/01/21  2            0
D     12/01/21  4            4
G     14/01/21  3            1
G     15/01/21  3            2
F     11/01/21  1            1

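【Comments】:

  • Please add the sample data and expected result as text we can use, not as a linked image we cannot use.
  • Fair point, I edited it to show only the relevant type A.

Tags: mysql group-by common-table-expression cumulative-sum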


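【Solution 1】:

You can try the following:

  1. Perform a union on the dataset so that all dates, both start and update dates, are considered. In the first CTE, t1, which builds this union, we apply your filter on Type and restrict the rows to the last 60 days with date >= (CURRENT_DATE-interval 60 day) (this condition is commented out in the example).
  2. For every date that occurs, CTE t2 counts the start events and the update events per name.
  3. Use the window function SUM to find the cumulative sum of the counts from the previous projection.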
    WITH t1 as (
        SELECT
            name,
            startdate as `date`,
            'start' as datetype
        FROM
            my_table
        WHERE
            Type='A' -- AND startdate >= (CURRENT_DATE-interval 60 day)
        UNION ALL
        SELECT
            name,
            updateddate as `date`,
            'updated' as datetype
        FROM
            my_table
        WHERE
            Type='A' -- AND updateddate >= (CURRENT_DATE-interval 60 day)
    ),
    -- count start and update events per name and date
    t2 as (
        SELECT
            name,
            `date`,
            COUNT(
                CASE
                    WHEN datetype='start' THEN 1
                END
            ) as cstart,
            COUNT(
                CASE
                    WHEN datetype='updated' THEN 1
                END
            ) as cend
        FROM
            t1
        WHERE `date` IS NOT NULL
        GROUP BY
            name,
            `date`
    )
    -- running totals of both counts, restarted for each name
    SELECT
        name as `Name`,
        `date` as `Date`,
        SUM(cstart) OVER (
            PARTITION BY name
            ORDER BY `date`
        ) as `Count Start`,
        SUM(cend) OVER (
            PARTITION BY name
            ORDER BY `date`
        ) as `Count Updated`
    FROM
        t2
    ORDER BY
        name, `date`;

Name  Date        Count Start  Count Updated
D     2021-01-11  2            0
D     2021-01-12  4            4
E     2021-01-07  2            0
E     2021-01-11  2            2
E     2021-01-12  4            2
E     2021-01-13  6            4
E     2021-01-14  6            6
F     2021-01-11  1            1
G     2021-01-14  3            1
G     2021-01-15  3            2
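
The query above leaves the 60-day restriction commented out so that it returns rows for the January 2021 sample data. A compact sketch with the filter switched on might look like the following; it assumes the same my_table layout and that startdate and updateddate are DATE (or DATETIME) columns rather than strings.

```sql
-- Same approach as above, with the 60-day filter enabled in both branches.
-- Assumption: startdate and updateddate are DATE/DATETIME columns.
WITH t1 AS (
    SELECT name, startdate AS `date`, 'start' AS datetype
    FROM my_table
    WHERE Type = 'A'
      AND startdate >= CURRENT_DATE - INTERVAL 60 DAY
    UNION ALL
    SELECT name, updateddate AS `date`, 'updated' AS datetype
    FROM my_table
    WHERE Type = 'A'
      -- a NULL updateddate never satisfies this comparison,
      -- so no separate IS NOT NULL check is needed here
      AND updateddate >= CURRENT_DATE - INTERVAL 60 DAY
),
t2 AS (
    -- one row per name and date, with both event counts
    SELECT name,
           `date`,
           COUNT(CASE WHEN datetype = 'start'   THEN 1 END) AS cstart,
           COUNT(CASE WHEN datetype = 'updated' THEN 1 END) AS cend
    FROM t1
    GROUP BY name, `date`
)
SELECT name   AS `Name`,
       `date` AS `Date`,
       SUM(cstart) OVER (PARTITION BY name ORDER BY `date`) AS `Count Start`,
       SUM(cend)   OVER (PARTITION BY name ORDER BY `date`) AS `Count Updated`
FROM t2
ORDER BY name, `date`;
```

Two things are worth keeping in mind with this variant. Each branch is filtered on its own date column, so a row that started more than 60 days ago but was updated recently still contributes to Count Updated. Also, the running totals start from zero at the beginning of the 60-day window; events before the cutoff are not carried into the sums.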

View working demo online on DB Fiddle

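【Comments】:

  • Wow. Thank you! This is exactly what I was trying to do. I just verified the values and it works as expected. Now I will study the code more closely. Thanks!
  • Sorry, I just realized the output needs a slight adjustment. I will edit my question using the output from your table.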