【Question title】: How to replace nulls with zeros in a pivot SQL query for a fact table in Databricks
【Posted】: 2021-08-06 16:47:05
【Question】:

I have seen plenty of solutions for doing this when an existing column is being aggregated, including the following:

how to Replace null with zero in pivot SQL query

Oracle 11g SQL - Replacing NULLS with zero where query has PIVOT

Replacing null values in dynamic pivot sql query

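and so on.

But how do you replace the nulls in a pivot query when you are building a fact table that records whether a condition exists?

For example, in Databricks: how do I replace the nulls in the following?

Setup: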

drop table if exists patient_dx;

create table patient_dx (patient_id string, dx string);

insert into patient_dx values
  ('Bob', 'cough'),
  ('Donna', 'cough'),
  ('Jerry', 'cough'),
  ('Bob', 'feaver'),
  ('Donna', 'head ache')
;

Query:

select * from (
  select
    patient_id,
    dx,
    cast (1 as int) cnt
  from
    patient_dx
)
pivot (
  max(cnt)
  for dx in ('cough','feaver','head ache')
)
;

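Result: the pivoted columns contain null wherever a patient has no row for that dx.

I tried several permutations of the following: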

cast(0 + cast(coalesce(sum(coalesce(cnt,0)),0) as int) as int) as cnt

to no avail.

【Comments】:

    Tags: apache-spark-sql pivot pivot-table databricks


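    【Solution 1】:

    The nulls only appear after the pivot has run, so you have to replace them in the outer select list, using either coalesce or a CASE WHEN ... IS NOT NULL expression.

    Try this and see if it helps: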

    spark.sql("""
    select
      patient_id,
      -- the pivot leaves null where a patient has no row for a dx; map those to 0
      CASE WHEN cough IS NOT NULL THEN cough ELSE 0 END as cough,
      CASE WHEN feaver IS NOT NULL THEN feaver ELSE 0 END as feaver,
      CASE WHEN `head ache` IS NOT NULL THEN `head ache` ELSE 0 END as `head ache`
    from (
      select * from patient_dx
    )
    PIVOT (
      count(dx)
      for dx in ('cough', 'feaver', 'head ache')
    )
    """).show()
    

    The output will be:

    patient_id  cough  feaver  head ache
    Donna       1      0       1
    Jerry       1      0       0
    Bob         1      1       0
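
To see why the replacement has to happen in the outer select rather than inside the aggregate, here is a plain-Python toy model of the pivot. It is only a sketch of the behaviour described in the question, not Spark's implementation; `pivot_counts` and `coalesce_cells` are hypothetical helper names. A (patient, dx) pair with no input rows never reaches the aggregate, so no expression inside `count(...)` or `sum(...)` can turn that cell into 0.

```python
from collections import defaultdict

# Same sample rows as the patient_dx table in the question.
rows = [
    ("Bob", "cough"),
    ("Donna", "cough"),
    ("Jerry", "cough"),
    ("Bob", "feaver"),
    ("Donna", "head ache"),
]
dx_columns = ["cough", "feaver", "head ache"]


def pivot_counts(rows, dx_columns):
    """Toy model of PIVOT(count(dx) FOR dx IN (...)): aggregate per (patient, dx) pair."""
    counts = defaultdict(int)
    for patient_id, dx in rows:
        counts[(patient_id, dx)] += 1
    patients = sorted({patient_id for patient_id, _ in rows})
    # A pair with no input rows was never aggregated, so its cell is None (SQL null).
    return {p: [counts.get((p, dx)) for dx in dx_columns] for p in patients}


def coalesce_cells(pivoted, default=0):
    """Equivalent of coalesce(col, 0) applied in the outer select list."""
    return {p: [default if c is None else c for c in cells] for p, cells in pivoted.items()}


pivoted = pivot_counts(rows, dx_columns)
filled = coalesce_cells(pivoted)
print(pivoted)  # cells for missing pairs are None
print(filled)   # the same cells become 0 after the outer coalesce
```

The filled values match the output table above: the zeros come from the step applied after the pivot, not from the aggregate.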

    If you want the list of pivot values to be dynamic:

    # Collect the distinct dx values and render them as a quoted SQL IN list
    dx_values = [row["dx"] for row in spark.sql("select distinct dx from patient_dx").collect()]
    in_list = ", ".join("'" + v + "'" for v in dx_values)
    val = spark.sql("""
    select
      patient_id,
      coalesce(cough, 0) as `cough`,
      coalesce(feaver, 0) as `feaver`,
      coalesce(`head ache`, 0) as `head ache`
    from (
      select * from patient_dx
    )
    PIVOT (
      count(dx)
      for dx in (""" + in_list + """)
    )
    """)
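
The dynamic variant above still lists the coalesce columns by hand, so a new dx value would be pivoted but not zero-filled. One possible way around that is to generate the select list from the same values as the IN list. The sketch below only builds the SQL text, so it runs without a cluster. `build_pivot_sql`, `quote_literal` and `quote_identifier` are hypothetical helpers, and the escaping assumes Spark's default string-literal parsing (backslash escapes) and backtick-quoted identifiers.

```python
def quote_literal(value):
    """Render a Python string as a single-quoted SQL string literal (backslash escapes assumed)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def quote_identifier(name):
    """Render a column name as a backtick-quoted identifier, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_pivot_sql(dx_values, table="patient_dx"):
    """Build the pivot query with one coalesce(..., 0) column per dx value."""
    select_list = ",\n  ".join(
        "coalesce({col}, 0) as {col}".format(col=quote_identifier(v)) for v in dx_values
    )
    in_list = ", ".join(quote_literal(v) for v in dx_values)
    lines = [
        "select",
        "  patient_id,",
        "  " + select_list,
        "from (select * from {})".format(table),
        "pivot (",
        "  count(dx)",
        "  for dx in ({})".format(in_list),
        ")",
    ]
    return "\n".join(lines)


sql = build_pivot_sql(["cough", "feaver", "head ache"])
print(sql)
# On a cluster where `spark` is defined, the values could come from
# "select distinct dx from patient_dx" and the result be run with spark.sql(sql).
```

Because the column list and the IN list are derived from the same values, they cannot drift apart when a new dx appears in the table.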
    

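    【Comments】:

    • This is great! I was hoping for a pure SQL solution, but I think this will at least solve part of my problem. I'll give it a try and let you know how it goes. Thanks!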