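Counting null values between dates: answers

【Question title】: Counting null values between dates
【Posted】: 2019-09-30 20:06:52
【Question】:

I am trying to count the number of NULL values between dates.

My table looks like this: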

transaction_date    transaction_sale
10/1/2018           NULL
11/1/2018           33
12/1/2018           NULL
1/1/2019            NULL
2/1/2019            NULL
3/1/2019            2
4/1/2019            NULL
5/1/2019            NULL
6/1/2019            10

I would like to get the following output:

transaction_date    transaction_sale   count
10/1/2018           NULL               NULL
11/1/2018           33                 1
12/1/2018           NULL               NULL
1/1/2019            NULL               NULL
2/1/2019            NULL               NULL
3/1/2019            2                  3
4/1/2019            NULL               NULL
5/1/2019            NULL               NULL
6/1/2019            10                 2

【Question comments】:

Sounds like a rolling calculation with the lag function.

Tags: sql postgresql window-functions gaps-and-islands

【Solution 1】:

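count(expression) does not count NULL values, neither as an aggregate function nor as a window function. The manual:

number of input rows for which the value of expression is not null

This is the key element for a simple and fast query.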
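As a quick illustration of that rule, here is a minimal sketch that loads the question's sample data into an in-memory SQLite database from Python. SQLite is only a stand-in here (the question is about PostgreSQL), the table name tbl is taken from the query below, and the ISO date strings are an assumption made for the demo. The plain aggregate count() treats NULL the same way in both systems.

```python
import sqlite3

# Sample data from the question; None stands in for SQL NULL.
# ISO date strings are an assumption made for this demo.
rows = [
    ("2018-10-01", None), ("2018-11-01", 33), ("2018-12-01", None),
    ("2019-01-01", None), ("2019-02-01", None), ("2019-03-01", 2),
    ("2019-04-01", None), ("2019-05-01", None), ("2019-06-01", 10),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (transaction_date TEXT, transaction_sale INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)

# count(*) counts every row; count(column) skips rows where the column is NULL.
all_rows, non_null_rows = conn.execute(
    "SELECT count(*), count(transaction_sale) FROM tbl"
).fetchone()
print(all_rows, non_null_rows)  # 9 3
```

The difference between the two numbers (9 rows, 3 non-null sales) is exactly what the window-function version exploits.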

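This assumes transaction_date is UNIQUE, as your example suggests. Otherwise you have to define how to break ties between duplicate values. (The actual table definition would clarify.)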

SELECT transaction_date, transaction_sale
     , CASE WHEN transaction_sale IS NOT NULL
            THEN count(*) OVER (PARTITION BY grp) - 1
       END AS count 
FROM  (
   SELECT *
        , count(transaction_sale) OVER (ORDER BY transaction_date DESC) AS grp
   FROM   tbl
   ) sub
ORDER  BY transaction_date;

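The subquery forms groups. Since every non-null value starts a new group according to your definition, simply counting actual values in descending order in a window function effectively assigns a group number to every row. The rest is trivial.

In the outer SELECT, count the rows per group and display the result only where transaction_sale IS NOT NULL. Subtract 1 so the non-null row itself is not counted. Voilà.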
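To make the grouping step easier to follow, here is a rough pure-Python trace of the same logic on the question's sample data. It is only a sketch of what the two window functions compute, not how PostgreSQL executes them. The function name null_counts is made up for illustration, and the rows are assumed to be sorted by date with unique dates.

```python
from collections import Counter
from itertools import accumulate

# Sample data from the question, oldest first; None stands in for NULL.
rows = [
    ("10/1/2018", None), ("11/1/2018", 33), ("12/1/2018", None),
    ("1/1/2019", None), ("2/1/2019", None), ("3/1/2019", 2),
    ("4/1/2019", None), ("5/1/2019", None), ("6/1/2019", 10),
]

def null_counts(rows):
    """Hypothetical helper mimicking the query on rows sorted by date."""
    # Subquery: running count of non-null sales, walking newest to oldest,
    # like count(transaction_sale) OVER (ORDER BY transaction_date DESC).
    running = accumulate(1 if sale is not None else 0
                         for _, sale in reversed(rows))
    grp = list(running)[::-1]          # back to oldest-first order
    # Outer query: rows per group, like count(*) OVER (PARTITION BY grp).
    group_size = Counter(grp)
    # Show the count only on non-null rows, minus 1 for the row itself.
    return [
        (day, sale, group_size[g] - 1 if sale is not None else None)
        for (day, sale), g in zip(rows, grp)
    ]

for row in null_counts(rows):
    print(row)
```

On the sample data the group numbers come out as 3, 3, 2, 2, 2, 2, 1, 1, 1 from oldest to newest, so each non-null row shares a group with the NULL rows that precede it.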

Related:

Select longest continuous sequence

Alternatively, count with FILTER (WHERE transaction_sale IS NULL), which is useful for related cases where we cannot simply subtract 1:

SELECT transaction_date, transaction_sale
     , CASE WHEN transaction_sale IS NOT NULL
            THEN count(*) FILTER (WHERE transaction_sale IS NULL)
                          OVER (PARTITION BY grp)
       END AS count 
FROM  (
   SELECT *
        , count(transaction_sale) OVER (ORDER BY transaction_date DESC) AS grp
   FROM   tbl
   ) sub
ORDER  BY transaction_date;

About the FILTER clause:

How can I simplify this game statistics query?

db<>fiddle here

【Comments】:

【Solution 2】:

This makes no assumptions about consecutive dates and so on.

with data as (
    select transaction_date, transaction_sale,
        count(transaction_sale)
            over (order by transaction_date desc) as grp
    from T /* replace with your table */
)
select transaction_date, transaction_sale,
    case when transaction_sale is null then null else
        count(case when transaction_sale is null then 1 end)
            over (partition by grp) end as "count"
from data
order by transaction_date;

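See a demo here. Although the demo runs on SQL Server, it should work the same on your platform: https://rextester.com/GVR65084

See also, for PostgreSQL: http://sqlfiddle.com/#!15/07c85f/1

【Comments】:

@jarlh It works on PostgreSQL as well. SQLFiddle was down at the time.

【Solution 3】:

If the dates are consecutive, you can use the following to get the previous date with a sale: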

select t.*,
       -- latest earlier date that has a non-null sale
       max(transaction_date) filter (where transaction_sale is not null)
           over (order by transaction_date
                 rows between unbounded preceding and 1 preceding)
from t;

If the difference is less than 12 months, you can use age() and extract():

select t.*,
       extract(month from
               -- age(later, earlier) yields a positive interval
               age(transaction_date,
                   max(transaction_date) filter (where transaction_sale is not null)
                       over (order by transaction_date
                             rows between unbounded preceding and 1 preceding)
                  )
              ) as diff
from t;
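For readers who want to check what this approach yields on the sample data, the following is a small Python sketch of the same idea: remember the latest earlier date with a non-null sale and take the difference in whole months. The function name months_since_previous_sale is hypothetical, and the sketch assumes monthly rows sorted by date. Unlike extract(month from age(...)), it does not wrap at 12 months.

```python
from datetime import date

# Sample data from the question, oldest first; None stands in for NULL.
rows = [
    (date(2018, 10, 1), None), (date(2018, 11, 1), 33),
    (date(2018, 12, 1), None), (date(2019, 1, 1), None),
    (date(2019, 2, 1), None), (date(2019, 3, 1), 2),
    (date(2019, 4, 1), None), (date(2019, 5, 1), None),
    (date(2019, 6, 1), 10),
]

def months_since_previous_sale(rows):
    """Hypothetical helper: months between each row and the previous sale."""
    result = []
    prev_sale_date = None  # latest earlier date with a non-null sale
    for day, sale in rows:
        if prev_sale_date is None:
            diff = None    # no earlier sale exists yet
        else:
            diff = ((day.year - prev_sale_date.year) * 12
                    + (day.month - prev_sale_date.month))
        result.append((day, sale, diff))
        if sale is not None:
            prev_sale_date = day   # only visible to later rows
    return result

diffs = [d for _, _, d in months_since_previous_sale(rows)]
print(diffs)  # [None, None, 1, 2, 3, 4, 1, 2, 3]
```

On rows with a sale, diff minus 1 matches the requested count (4 - 1 = 3 for 3/1/2019 and 3 - 1 = 2 for 6/1/2019). The first sale has no earlier sale to compare against, so this approach yields no value for 11/1/2018.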

【Comments】:

【Solution 4】:

If transaction_date is a date column, you can simply use:

select count(*) from Counter where transaction_date > date_lower and transaction_date < date_higher and tx_sale is null;

【Comments】: