【Question title】: How to get maximum column values across a row in Teradata SQL?
【Posted】: 2015-10-15 19:12:41
【Question】:

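I have a table named cnst_chrctrstc_abc that contains 10 columns (equ_gender1 through bb_population_flag) holding numeric values (counts) for each row. I want to get the top 5 values out of these 10 numeric columns in each row.

My query looks like this: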

SEL 
FROM 
(
SEL 
SUM(CASE WHEN COALESCE(act.equ_gender1,'') = COALESCE(inact.equ_gender1,'') THEN 0 ELSE 1 END ) AS equ_gender1_chg_cnt,
SUM(CASE WHEN COALESCE(act.exp_ex_bmyr1,'') = COALESCE(inact.exp_ex_bmyr1,'') THEN 0 ELSE 1 END ) AS exp_ex_bmyr1_chg_cnt,
SUM(CASE WHEN COALESCE(act.equ_age1,'') = COALESCE(inact.equ_age1,'') THEN 0 ELSE 1 END ) AS equ_age1_chg_cnt,
SUM(CASE WHEN COALESCE(act.maritalstatus1,'') = COALESCE(inact.maritalstatus1,'') THEN 0 ELSE 1 END ) AS maritalstatus1_chg_cnt,
SUM(CASE WHEN COALESCE(act.person_type1,'') = COALESCE(inact.person_type1,'') THEN 0 ELSE 1 END ) AS person_type1_chg_cnt,
SUM(CASE WHEN COALESCE(act.homeowner,'') = COALESCE(inact.homeowner,'') THEN 0 ELSE 1 END ) AS homeowner_chg_cnt,
SUM(CASE WHEN COALESCE(act.dwelling_size,'') = COALESCE(inact.dwelling_size,'') THEN 0 ELSE 1 END ) AS dwelling_size_chg_cnt,
SUM(CASE WHEN COALESCE(act.lengthofresidence,'') = COALESCE(inact.lengthofresidence,'') THEN 0 ELSE 1 END ) AS lengthofresidence_chg_cnt,
SUM(CASE WHEN COALESCE(act.childrenage0_18,'') = COALESCE(inact.childrenage0_18,'') THEN 0 ELSE 1 END ) AS childrenage0_18_chg_cnt,
SUM(CASE WHEN COALESCE(act.bb_population_flag,'') = COALESCE(inact.bb_population_flag,'') THEN 0 ELSE 1 END ) AS bb_population_flag


FROM
(SEL * FROM arc_mdm_Tbls.cnst_chrctrstc_abc WHERE load_id=1024 AND cnst_chrctrstc_end_dt='9999-12-31' (DATE))act
LEFT JOIN
(SEL * FROM arc_mdm_Tbls.cnst_chrctrstc_abc WHERE load_id=1024 AND cnst_chrctrstc_end_dt<'9999-12-31' (DATE) 
QUALIFY ROW_NUMBER() OVER (PARTITION BY cnst_mstr_id ORDER BY cnst_chrctrstc_strt_ts DESC)=1
)inact
ON act.cnst_mstr_id = inact.cnst_mstr_id
)X

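I know that SEL GREATEST would produce the maximum value in each row. But I want the 5 highest values, with a rank assigned to each of them.

For some rows the first five columns may contain the top 5 values, and for others the last five columns (homeowner through bb_population_flag) may contain them.

So if the columns and values from cnst_chrctrstc_abc look like this: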

cdi_batch_id | a | b | c | d | e | f | g | h | i |j
1024         |116|105|102|100|117|119|108|104|101|121

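then the select query should return columns j, f, e, a, g, which hold the top 5 values. After that I would assign a rank to each of them accordingly.

Should this be done with unpivot or something else? Thanks in advance.

【Question comments】: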

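  • OK, lots of questions. #1: You are comparing the current version of a row with the previous one (this can probably be rewritten using an OLAP function). #2: Your current query returns a single row, so you probably want individual rows instead of SUM. #3: Should your result return 5 rows, one for each of the columns with the five highest values? Or five rows per column (with only this column, or with the others as well)? What if there are duplicate values?
  • Yes. The second assumption is correct. Actually, I want to select the top five values, something like SEL GREATEST( ----- ), then assign ranks to those column values and insert them into another table.
  • Can you add sample data and the expected result?
  • Please take a look. Thanks.
  • One last question: what is your Teradata release?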

Tags: teradata


【Solution 1】:

Yes, you need to unpivot your result.

Before TD14.10 you need a list of those column names, either as a table:

create table ColumnList (col varchar(128));
Insert into ColumnList('equ_gender1'       );
Insert into ColumnList('exp_ex_bmyr1'      );
Insert into ColumnList('equ_age1'          );
Insert into ColumnList('maritalstatus1'    );
Insert into ColumnList('person_type1'      );
Insert into ColumnList('homeowner'         );
Insert into ColumnList('dwelling_size'     );
Insert into ColumnList('lengthofresidence' );
Insert into ColumnList('childrenage0_18'   );
Insert into ColumnList('bb_population_flag');

or built on the fly inside your query, using a clunky CTE:

with ColumnList as 
 (
   select * from (select 'equ_gender1'        as Col) as dt union all
   select * from (select 'exp_ex_bmyr1'       as Col) as dt union all
   select * from (select 'equ_age1'           as Col) as dt union all
   select * from (select 'maritalstatus1'     as Col) as dt union all
   select * from (select 'person_type1'       as Col) as dt union all
   select * from (select 'homeowner'          as Col) as dt union all
   select * from (select 'dwelling_size'      as Col) as dt union all
   select * from (select 'lengthofresidence'  as Col) as dt union all
   select * from (select 'childrenage0_18'    as Col) as dt union all
   select * from (select 'bb_population_flag' as Col) as dt
 )

Then you CROSS JOIN to it to unpivot:

select
   col,
   case col 
      when 'equ_gender1'        then equ_gender1       
      when 'exp_ex_bmyr1'       then exp_ex_bmyr1      
      when 'equ_age1'           then equ_age1          
      when 'maritalstatus1'     then maritalstatus1    
      when 'person_type1'       then person_type1      
      when 'homeowner'          then homeowner         
      when 'dwelling_size'      then dwelling_size     
      when 'lengthofresidence'  then lengthofresidence 
      when 'childrenage0_18'    then childrenage0_18   
      when 'bb_population_flag' then bb_population_flag
   end as Counts,
   rank() over (order by Counts desc) as rnk
FROM
 (
   your current select
 ) as dt
cross join ColumnList
qualify rnk <= 5
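
To make the CROSS JOIN unpivot less abstract, here is a minimal Python sketch of the same logic (an illustration only, not Teradata code). It uses the sample row from the question; the single-letter column names and the helper names `unpivot` and `top_n` are assumptions made for this example. The cross join pairs the row with each of the 10 column names, the lookup plays the role of the CASE expression, and the ranking step mimics RANK() with QUALIFY.

```python
from itertools import product

# Sample row from the question; the columns a..j stand in for the real count columns.
rows = [{"cdi_batch_id": 1024, "a": 116, "b": 105, "c": 102, "d": 100, "e": 117,
         "f": 119, "g": 108, "h": 104, "i": 101, "j": 121}]
column_list = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]


def unpivot(rows, column_list):
    """Cross join each row with the column list and pick the matching value (the CASE)."""
    return [(col, row[col]) for row, col in product(rows, column_list)]


def top_n(pairs, n=5):
    """Mimic RANK() OVER (ORDER BY Counts DESC) followed by QUALIFY rnk <= n."""
    ordered = sorted(pairs, key=lambda p: p[1], reverse=True)
    ranked = []
    for pos, (col, val) in enumerate(ordered, start=1):
        # RANK semantics: tied values share the rank of the first tied row.
        rnk = ranked[-1][2] if ranked and ranked[-1][1] == val else pos
        ranked.append((col, val, rnk))
    return [r for r in ranked if r[2] <= n]


result = top_n(unpivot(rows, column_list))
for col, val, rnk in result:
    print(rnk, col, val)
```

For the sample row this lists j, f, e, a, g with ranks 1 to 5. Note that with tied values RANK can return more than five rows, which is the duplicate-value case raised in the comments.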

In TD14.10 you can use the TD_UNPIVOT function:

SELECT Col, rank() over (order by Counts desc) as rnk 
from TD_UNPIVOT(
        ON (
             your current select
           )
        USING
           VALUE_COLUMNS('Counts')
           UNPIVOT_COLUMN('Col')
           COLUMN_LIST('equ_gender1'
                      ,'exp_ex_bmyr1'
                      ,'equ_age1'          
                      ,'maritalstatus1'
                      ,'person_type1'
                      ,'homeowner'
                      ,'dwelling_size'
                      ,'lengthofresidence'
                      ,'childrenage0_18'
                      ,'bb_population_flag')

        ) dt
qualify rnk <= 5;

Edit:

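Additionally, you can replace the LEFT JOIN with a single OLAP function. Depending on the number of rows per cnst_mstr_id this might be more efficient, since you need a ROW_NUMBER anyway: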

SEL 
SUM(CASE WHEN COALESCE(equ_gender1,'') = COALESCE(last_equ_gender1,'') THEN 0 ELSE 1 END ) AS equ_gender1_chg_cnt,
...
FROM
 ( SELECT 
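      equ_gender1,
      -- value from the previous version: the next row when ordered by start timestamp descending
      min(equ_gender1) OVER (PARTITION BY cnst_mstr_id ORDER BY cnst_chrctrstc_strt_ts DESC rows between 1 following and 1 following) as last_equ_gender1,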
      ...
   FROM arc_mdm_Tbls.cnst_chrctrstc_abc 
   WHERE load_id=1024
   qualify cnst_chrctrstc_end_dt= date '9999-12-31'
 )act
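
What the window frame does can be sketched in Python (illustration only, under assumptions): for each cnst_mstr_id the versions are sorted by start timestamp descending, and "1 following" picks the value from the next older version. The `history` rows and the helper names `with_previous` and `changed` are made up for this example, and the sketch assumes the active row (end date 9999-12-31) is the newest version of each id.

```python
from collections import defaultdict

# Hypothetical history rows: (cnst_mstr_id, cnst_chrctrstc_strt_ts, end_dt, equ_gender1)
history = [
    (1, "2015-01-01", "2015-06-30", "M"),
    (1, "2015-07-01", "9999-12-31", "F"),
    (2, "2015-02-01", "2015-03-31", None),
    (2, "2015-04-01", "9999-12-31", None),
    (3, "2015-05-01", "9999-12-31", "M"),
]


def with_previous(rows):
    """Return (id, current value, previous value) for each active row."""
    by_id = defaultdict(list)
    for r in rows:
        by_id[r[0]].append(r)  # PARTITION BY cnst_mstr_id
    out = []
    for cid, versions in by_id.items():
        versions.sort(key=lambda r: r[1], reverse=True)  # ORDER BY strt_ts DESC
        for i, (_, _, end_dt, val) in enumerate(versions):
            # ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING: the next older version, if any
            prev = versions[i + 1][3] if i + 1 < len(versions) else None
            if end_dt == "9999-12-31":  # QUALIFY keeps only the active row
                out.append((cid, val, prev))
    return out


def changed(cur, prev):
    """Mimic COALESCE(x, '') on both sides of the comparison."""
    return (cur or "") != (prev or "")


active = with_previous(history)
chg_cnt = sum(changed(cur, prev) for _, cur, prev in active)
print(chg_cnt)
```

As with the LEFT JOIN version, an id without any earlier version is compared against NULL, so a non-null current value counts as a change.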

【Comments】:

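  • Thanks for the solution. But I don't understand the cross join part.
  • It should give me a Cartesian product, right? How will that help me transpose the table?
  • Exactly, it returns 10 rows, and the CASE unpivots one column in each of those rows; this is the way to unpivot manually. In your new question you tried to do the same with multiple UNION ALLs, which gives the same result but with much more overhead to produce it...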