【Question title】: DataFrame select expression inside withColumn
【Posted】: 2021-05-25 20:41:15
【Question】:

I am trying to create a DataFrame by adding a column with a case statement. Below is the code snippet.

orders=ordDtConvDF.withColumn('Status',\
ordDtConvDF.selectExpr('case when order_status in ("CLOSED","COMPLETE") then "completed" \
when order_status="PENDING_PAYMENT" then "Pending" \
else "Processing/canceled" end' ) ) 

It gives the error below. Any help is appreciated.

AssertionError                            Traceback (most recent call last)
<ipython-input-104-e633b3a604b9> in <module>
----> 1 orders=ordDtConvDF.withColumn('Status',ordDtConvDF.selectExpr('case when order_status in ("CLOSED","COMPLETE") then "completed" \
      2                              when order_status="PENDING_PAYMENT" then "Pending" \
      3                        else "Processing/canceled" end' ) )

/opt/anaconda3/lib/python3.8/site-packages/pyspark/sql/dataframe.py in withColumn(self, colName, col)
   2452 
   2453         """
-> 2454         assert isinstance(col, Column), "col should be Column"
   2455         return DataFrame(self._jdf.withColumn(colName, col._jc), self.sql_ctx)
   2456 

AssertionError: col should be Column

【Comments】:

    Tags: apache-spark pyspark


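    【Solution 1】:

    Try this instead of selectExpr. selectExpr returns a DataFrame, while withColumn expects a Column as its second argument, which is why the assertion fails. F.when builds a Column: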

    from pyspark.sql import functions as F
    
    orders = (ordDtConvDF.withColumn('Status', F
        .when(F.col('order_status').isin('CLOSED', 'COMPLETE'), 'completed')
        .when(F.col('order_status').isin('PENDING_PAYMENT'), 'Pending')
        .otherwise('Processing/canceled')
    ))
    

    【Comments】:

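      【Solution 2】:

      Use F.expr instead of selectExpr. F.expr parses a SQL expression string into a Column, so the case statement can stay as written: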

      from pyspark.sql import functions as F

      orders = ordDtConvDF.withColumn(
          'Status',
          F.expr("""
              case when order_status in ("CLOSED","COMPLETE") then "completed"
                   when order_status = "PENDING_PAYMENT" then "Pending"
                   else "Processing/canceled" 
              end
          """ 
          ) 
      ) 
      

      【Comments】:
