【Title】: How to perform a nested When Otherwise in PySpark?
【Posted】: 2020-10-09 16:46:04
【Question】:

Hi all, I am trying to interpret this Power BI syntax and convert it to PySpark:

 if(UCS_Incidents[Intensity]="Very High",
 IF(UCS_Incidents[Severity]="Very High","Red",
 IF(UCS_Incidents[Severity]="High","Red",
 IF(UCS_Incidents[Severity]="Medium","Orange","Yellow"))),

 if(UCS_Incidents[Intensity]="High",
 IF(UCS_Incidents[Severity]="Very High","Red",
 IF(UCS_Incidents[Severity]="High","Orange",
 IF(UCS_Incidents[Severity]="Medium","Orange","Yellow"))),

 if(UCS_Incidents[Intensity]="Medium",
 IF(UCS_Incidents[Severity]="Very High","Orange",
 IF(UCS_Incidents[Severity]="High","Yellow",
 IF(UCS_Incidents[Severity]="Medium","Yellow","Green"))),

 if(UCS_Incidents[Intensity]="Low",
 IF(UCS_Incidents[Severity]="Very High","Yellow",
 IF(UCS_Incidents[Severity]="High","Green",
 IF(UCS_Incidents[Severity]="Medium","Green","Green"))),

 ""))))

This is what I have tried:

 Intensities = df.withColumn(('Intensities',f.when((f.col('Intensity') == 'Very High') & (f.col('Severity') == 'Very High') , "Red").
                        otherwise(f.when((f.col('Intensity') == 'Very High') & (f.col('Severity') == 'High') , "Red").
                        otherwise(f.when((f.col('Intensity') == 'Very High') & (f.col('Severity') == 'Medium') , "Orange")
                        .otherwise('Yellow'))))
                        .otherwise(f.when((f.col('Intensity') == 'High') & (f.col('Severity') == 'Very High') , "Red").
                        otherwise(f.when((f.col('Intensity') == 'High') & (f.col('Severity') == 'High') , "Orange").
                        otherwise(f.when((f.col('Intensity') == 'High') & (f.col('Severity') == 'Medium') , "Orange")
                        .otherwise('Yellow'))))
                        .otherwise(f.when((f.col('Intensity') == 'Medium') & (f.col('Severity') == 'Very High') , "Orange").
                        otherwise(f.when((f.col('Intensity') == 'Medium') & (f.col('Severity') == 'High') , "Yellow").
                        otherwise(f.when((f.col('Intensity') == 'Medium') & (f.col('Severity') == 'Medium') , "Yellow")
                        .otherwise('Green'))))
                        .otherwise(f.when((f.col('Intensity') == 'Low') & (f.col('Severity') == 'Very High') , "Yellow").
                        otherwise(f.when((f.col('Intensity') == 'Low') & (f.col('Severity') == 'High') , "Green").
                        otherwise(f.when((f.col('Intensity') == 'Low') & (f.col('Severity') == 'Medium') , "Green")
                        .otherwise('Green'))))

                        ).otherwise("")

However, I get this error:

  A Tuple Object doesn't have an attribute Otherwise

Any help would be appreciated, thanks.

【Comments】:

  • Try converting the nested IF logic into a SQL CASE/WHEN statement, then use the f.expr() function to retrieve the result.

Tags: if-statement pyspark case-when


【Solution 1】:

Just to illustrate what @jxc means, assuming you already have a DataFrame named df:

from pyspark.sql.functions import expr

Intensities = df.withColumn('Intensities', expr("CASE WHEN Intensity = 'Very High' AND Severity = 'Very High' THEN 'Red' WHEN .... ELSE ... END"))

I left "..." as placeholders, but I think this makes the approach clearer.
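
To make the placeholders concrete, here is one way the full expression might be spelled out. This is only a sketch: the `COLOR_RULES` table and the helper names `build_case_expr` and `add_intensities` are made up for illustration, and the colors are copied from the Power BI formula in the question. The SQL string is built in plain Python so it can be inspected without a Spark session.

```python
# Colors copied from the Power BI formula in the question.
# Each Intensity maps to (colors for explicit Severity values, fallback color).
COLOR_RULES = {
    "Very High": ({"Very High": "Red", "High": "Red", "Medium": "Orange"}, "Yellow"),
    "High": ({"Very High": "Red", "High": "Orange", "Medium": "Orange"}, "Yellow"),
    "Medium": ({"Very High": "Orange", "High": "Yellow", "Medium": "Yellow"}, "Green"),
    "Low": ({"Very High": "Yellow", "High": "Green", "Medium": "Green"}, "Green"),
}


def build_case_expr(rules=COLOR_RULES):
    """Return a nested SQL CASE expression mirroring the nested IFs."""
    outer = []
    for intensity, (by_severity, fallback) in rules.items():
        inner = " ".join(
            f"WHEN '{severity}' THEN '{color}'"
            for severity, color in by_severity.items()
        )
        outer.append(
            f"WHEN '{intensity}' THEN CASE Severity {inner} ELSE '{fallback}' END"
        )
    # Any other Intensity (including NULL) falls through to the empty string.
    return "CASE Intensity " + " ".join(outer) + " ELSE '' END"


def add_intensities(df):
    """Add the 'Intensities' column to a Spark DataFrame (hypothetical helper)."""
    from pyspark.sql.functions import expr  # imported lazily; needs PySpark

    return df.withColumn("Intensities", expr(build_case_expr()))


print(build_case_expr())
```

With an existing DataFrame this would be called as `Intensities = add_intensities(df)`. Note that Spark compares strings case-sensitively, so a value such as 'very high' would land in the ELSE branch.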

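As for the error in the question: `df.withColumn((` opens an extra parenthesis, so `'Intensities', f.when(...)` becomes a Python tuple, and the `)))` after the first `.otherwise('Yellow')` closes that tuple before the next `.otherwise(` is reached. Calling `.otherwise` on a tuple is what raises the error. If you prefer the Column API over `expr`, `when` calls can also be chained flat rather than nested inside `otherwise`. The sketch below is one possible arrangement; `demo_tuple_error`, `color_column` and `severity_case` are hypothetical names, and it assumes `df` has string columns `Intensity` and `Severity`.

```python
def demo_tuple_error():
    """Reproduce the error without Spark: tuples have no .otherwise method."""
    pair = ("Intensities", "placeholder for a Column")
    try:
        pair.otherwise("")
    except AttributeError as err:
        return str(err)


def color_column():
    """Build the color Column with flat when().when() chains (needs PySpark)."""
    from pyspark.sql import functions as f  # imported lazily; needs PySpark

    intensity, severity = f.col("Intensity"), f.col("Severity")

    def severity_case(very_high, high, medium, fallback):
        # One inner chain per Intensity value, keyed on Severity.
        return (
            f.when(severity == "Very High", very_high)
            .when(severity == "High", high)
            .when(severity == "Medium", medium)
            .otherwise(fallback)
        )

    return (
        f.when(intensity == "Very High", severity_case("Red", "Red", "Orange", "Yellow"))
        .when(intensity == "High", severity_case("Red", "Orange", "Orange", "Yellow"))
        .when(intensity == "Medium", severity_case("Orange", "Yellow", "Yellow", "Green"))
        .when(intensity == "Low", severity_case("Yellow", "Green", "Green", "Green"))
        .otherwise("")
    )


print(demo_tuple_error())
# Usage with an existing DataFrame `df` and an active Spark session:
# Intensities = df.withColumn("Intensities", color_column())
```

Note that `withColumn` takes the column name and the expression as two separate arguments, with no extra parentheses around them.
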
【Comments】:
