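【Question title】: Conditional ROW_NUMBER() to skip nulls still counting them
【Posted】: 2018-11-19 15:33:47
【Question description】:

I'm trying to add a ROW_NUMBER column to a dataset that follows a [Withdrawal then Fee; Withdrawal then Fee] order, based on whether the NETWORKCD column is null. However, my row_number formula still counts the null rows, even though it doesn't display a number for them, as shown below: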

+--------+-------------+------------+-----------+-------------+
| Acctid | Transaction |  PostDate  | NetworkCd | PeriodCount |
+--------+-------------+------------+-----------+-------------+
|  12345 | Withdrawal  | 10/4/2018  | FRGN      |           1 |
|  12345 | Fee         | 10/4/2018  |           |             |
|  12345 | Withdrawal  | 10/11/2018 | FRGN      |           3 |
|  12345 | Fee         | 10/11/2018 |           |             |
|  12345 | Withdrawal  | 10/22/2018 | FRGN      |           5 |
|  12345 | Fee         | 10/22/2018 |           |             |
+--------+-------------+------------+-----------+-------------+

I'm using the following formula for PeriodCount:

(case when networkcd is not null 
      then row_number() over (partition by acctid order by postdate) 
 end) PeriodCount

I want PeriodCount to count [1,2,3] instead of [1,3,5], like this:

+--------+-------------+------------+-----------+-------------+
| Acctid | Transaction |  PostDate  | NetworkCd | PeriodCount |
+--------+-------------+------------+-----------+-------------+
|  12345 | Withdrawal  | 10/4/2018  | FRGN      |           1 |
|  12345 | Fee         | 10/4/2018  |           |             |
|  12345 | Withdrawal  | 10/11/2018 | FRGN      |           2 |
|  12345 | Fee         | 10/11/2018 |           |             |
|  12345 | Withdrawal  | 10/22/2018 | FRGN      |           3 |
|  12345 | Fee         | 10/22/2018 |           |             |
+--------+-------------+------------+-----------+-------------+

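What am I missing?

【Discussion】:

  • row_number() is exactly that, the row number. Whether you choose to display it or replace it with null in a case expression doesn't change which row number it is.
  • @Andrew That was my first thought, but doesn't the case statement on networkcd adjust for the nulls?
  • No worries, the Dense_Rank that Alex posted is the way forward; it comes down to a nuance of how these functions work.

Tags: sql plsql oracle11g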


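【Solution 1】:

It's still showing a value at rows 1, 3 and 5, so the row number is correct for each of those rows when they are encountered. It's the row number within the whole result set, not within the non-null values. Your case expression decides whether to show the value; it doesn't affect how the value is computed.

You can use dense_rank() instead to get the result you want: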

with your_table (Acctid, Transaction, PostDate, NetworkCd) as (
          select 12345, 'Withdrawal', to_date('10/4/2018', 'MM/DD/YYYY'), 'FRGN' from dual
union all select 12345, 'Fee', to_date('10/4/2018', 'MM/DD/YYYY'), null from dual
union all select 12345, 'Withdrawal', to_date('10/11/2018', 'MM/DD/YYYY'), 'FRGN' from dual
union all select 12345, 'Fee', to_date('10/11/2018', 'MM/DD/YYYY'), null from dual
union all select 12345, 'Withdrawal', to_date('10/22/2018', 'MM/DD/YYYY'), 'FRGN' from dual
union all select 12345, 'Fee', to_date('10/22/2018', 'MM/DD/YYYY'), null from dual
)
select Acctid, Transaction, PostDate, NetworkCd,
(case when networkcd is not null then dense_rank() over (partition by acctid order by postdate) end) as periodcount
from your_table;

    ACCTID TRANSACTIO POSTDATE   NETW PERIODCOUNT
---------- ---------- ---------- ---- -----------
     12345 Withdrawal 2018-10-04 FRGN           1
     12345 Fee        2018-10-04                 
     12345 Withdrawal 2018-10-11 FRGN           2
     12345 Fee        2018-10-11                 
     12345 Withdrawal 2018-10-22 FRGN           3
     12345 Fee        2018-10-22                 

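... as that does suppress the gaps in the generated values. From the docs:

DENSE_RANK computes the rank of a row in an ordered group of rows and returns the rank as a NUMBER. The ranks are consecutive integers beginning with 1. ...

【Discussion】:

  • Worked perfectly, thank you. I'll spend some time studying Dense_rank().
【Solution 2】:

Try changing partition by acctid to partition by acctid, NetworkCd IS NOT NULL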
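A note for the oracle11g tag on the question: as far as I know, Oracle 11g SQL does not accept a condition such as NetworkCd IS NOT NULL as an expression in a partition by list. A minimal sketch of an equivalent form wraps the test in a case expression. It assumes the table is called your_table, the placeholder name used in Solution 1:

```sql
-- your_table is a placeholder name borrowed from Solution 1.
-- The inner case expression yields 1 for a non-null NetworkCd and NULL
-- otherwise, so each account is split into two partitions that are
-- numbered independently; the outer case hides the numbers on null rows.
select acctid, transaction, postdate, networkcd,
       case when networkcd is not null
            then row_number() over (
                     partition by acctid,
                                  case when networkcd is not null then 1 end
                     order by postdate)
       end as periodcount
from your_table
order by acctid, postdate, networkcd;
```

The final order by only controls display order; the numbering itself comes from the order by inside the over clause.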

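【Discussion】:

    【Solution 3】:

    Alex Poole's answer works, but it depends on the fact that your null rows have the same POSTDATE as the non-null rows you want to count. If you change the dates so that they all differ, you'll see dense_rank fail: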

    with your_table (Acctid, Transaction, PostDate, NetworkCd) as (
                  select 12345, 'Withdrawal', to_date('10/1/2018', 'MM/DD/YYYY'), 'FRGN'
        union all select 12345, 'Fee', to_date('10/2/2018', 'MM/DD/YYYY'), null
        union all select 12345, 'Withdrawal', to_date('10/3/2018', 'MM/DD/YYYY'), 'FRGN'
        union all select 12345, 'Fee', to_date('10/4/2018', 'MM/DD/YYYY'), null
        union all select 12345, 'Withdrawal', to_date('10/5/2018', 'MM/DD/YYYY'), 'FRGN'
        union all select 12345, 'Fee', to_date('10/6/2018', 'MM/DD/YYYY'), null
        )
    
    select
        Acctid,
        Transaction,
        PostDate,
        NetworkCd,
        case when networkcd is not null then dense_rank() over (partition by Acctid order by PostDate) end as PeriodCount,
        dense_rank() over (partition by Acctid order by PostDate) as DenseRank
    from your_table
    order by Acctid, PostDate, Transaction
    ;
    

    yields

    acctid  transaction  postdate    networkcd  periodcount  denserank
    ------  -----------  ----------  ---------  -----------  ---------
    12345   Withdrawal   2018-10-01  FRGN       1            1
    12345   Fee          2018-10-02                          2
    12345   Withdrawal   2018-10-03  FRGN       3            3
    12345   Fee          2018-10-04                          4
    12345   Withdrawal   2018-10-05  FRGN       5            5
    12345   Fee          2018-10-06                          6
    

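    (The right-hand column shows the result of dense_rank without the null condition.) So it works for your data, but it isn't reliable in general.

    Kevin Siemons's answer works better: it counts the null and non-null rows separately, and you can simply hide the count made for the null rows: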

    with your_table (Acctid, Transaction, PostDate, NetworkCd) as (
                  select 12345, 'Withdrawal', to_date('10/1/2018', 'MM/DD/YYYY'), 'FRGN'
        union all select 12345, 'Fee', to_date('10/2/2018', 'MM/DD/YYYY'), null
        union all select 12345, 'Withdrawal', to_date('10/3/2018', 'MM/DD/YYYY'), 'FRGN'
        union all select 12345, 'Fee', to_date('10/4/2018', 'MM/DD/YYYY'), null
        union all select 12345, 'Withdrawal', to_date('10/5/2018', 'MM/DD/YYYY'), 'FRGN'
        union all select 12345, 'Fee', to_date('10/6/2018', 'MM/DD/YYYY'), null
        )
    
    select
        Acctid,
        Transaction,
        PostDate,
        NetworkCd,
        case when networkcd is not null then row_number() over (partition by Acctid, NetworkCd is not null order by PostDate) end as PeriodCount,
        row_number() over (partition by Acctid, NetworkCd is not null order by PostDate) as RowNumber
    from your_table
    order by Acctid, PostDate, Transaction
    ;
    

    yields

    acctid  transaction  postdate    networkcd  periodcount  rownumber
    ------  -----------  ----------  ---------  -----------  ---------
    12345   Withdrawal   2018-10-01  FRGN       1            1
    12345   Fee          2018-10-02                          1
    12345   Withdrawal   2018-10-03  FRGN       2            2
    12345   Fee          2018-10-04                          2
    12345   Withdrawal   2018-10-05  FRGN       3            3
    12345   Fee          2018-10-06                          3
    

    I recommend using case when networkcd is not null then row_number() over (partition by Acctid, NetworkCd is not null order by PostDate) end as PeriodCount

    【Discussion】:

      【Solution 4】:

      The ROW_NUMBER() function won't work as expected here, but you can do this instead:

      -- your_table stands in for the real table name (TABLE is a reserved word)
      select t.*,
             (select count(*)
              from your_table t1
              where t1.acctid = t.acctid and t1.PostDate <= t.PostDate and
                    t1.networkcd is not null 
             ) as PeriodCount
      from your_table t;
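The correlated subquery above can probably also be written as a running analytic count, which avoids scanning the table once per row. This is only a sketch, with your_table as a placeholder table name. It relies on count(expr) ignoring NULLs, and it adds a case expression so that the null rows show no number, which the subquery version does not do:

```sql
-- your_table is a placeholder name.
-- count(networkcd) skips NULLs, so the running count only advances
-- on rows whose NetworkCd is non-null.
select t.*,
       case when t.networkcd is not null
            then count(t.networkcd) over (partition by t.acctid
                                          order by t.postdate)
       end as periodcount
from your_table t;
```

With the default window, rows that share a PostDate are counted together, so two non-null rows on the same date would receive the same number, just as they would with the `<=` comparison in the subquery.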
      

      【Discussion】:

        【Solution 5】:

        It looks like it isn't NULL but an empty string. SQL treats the two differently.

        Try:

        case when networkcd = '' then row_number()....
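One caveat, given the oracle11g tag on the question: Oracle treats a zero-length character string as NULL, so a comparison such as networkcd = '' never evaluates to true there. A quick check one could run to confirm this behaviour (empty_string_check is just an illustrative column alias):

```sql
-- In Oracle, '' is treated as NULL, so the "is null" branch is taken.
select case when '' is null
            then 'treated as NULL'
            else 'not NULL'
       end as empty_string_check
from dual;
```

On other databases, where an empty string is a distinct value, the suggestion in this answer may still be relevant.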
        

        【Discussion】:
