【Question title】: Multiple elif conditions in a pandas DataFrame over multiple columns
【Posted】: 2021-03-18 16:27:21
【Question】:

I am having trouble applying multiple elif conditions to a large dataset. Sample data:

import numpy as np
import pandas as pd

# Blank cells are empty strings, not NaN.
Id = ['AM12', 'AM21', 'AM31', 'AM41', 'AM66', 'AM81', 'AM77', 'AM87', 'AM27', 'AM69']
Exec = ['Athreyu', 'Megan', '', 'Omar', 'Michael', '', 'Oliver', '', 'Jesus', '']
AD_Executive = ['', 'Fer', 'Virat', 'John', '', 'John', '', '', 'pandya', 'John']
Ex_FiscId = ['John', 'Sonal', '', 'Ram', '', 'Anthony', '', '', 'Sriju', '']
full_nm = ['pulari', '', 'Burgers', 'Saheb', 'Bhavya', 'Borah', 'Dutta', 'Upinder', 'Ruhaan', 'Rochan']
df_ex = pd.DataFrame(list(zip(Id, full_nm, Exec, AD_Executive, Ex_FiscId)),
                     columns=['Id', 'full_nm', 'Exec', 'AD_Executive', 'Ex_FiscId'])

I want to create a new column holding the final name. The conditions I applied are:

def final(df_ex):
    if df_ex['Ex_FiscId'] != np.NaN:
        return df_ex['Ex_FiscId']
    elif (df_ex['Ex_FiscId'] == np.NaN) & (df_ex['AD_Executive'] != np.NaN):
        return df_ex['AD_Executive']
    elif (df_ex['Ex_FiscId'] == np.NaN) & (df_ex['AD_Executive'] == np.NaN) & (df_ex['Exec'] != np.NaN):
        return df_ex['Exec']
    elif (df_ex['Ex_FiscId'] == np.NaN) & (df_ex['AD_Executive'] == np.NaN) & (df_ex['Exec'] == np.NaN):
        return df_ex['full_nm']

df_ex['Final'] = df_ex.apply(final, axis=1)

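But it does not produce the desired output. The code seems to evaluate only the first if condition and ignore the others.

I have also included the input and desired output tables for reference.

Input: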

Id    full_nm  Exec     AD_Executive  Ex_FiscId
AM12  pulari   Athreyu                John
AM21           Megan    Fer           Sonal
AM31  Burgers           Virat
AM41  Saheb    Omar     John          Ram
AM66  Bhavya   Michael
AM81  Borah             John          Anthony
AM77  Dutta    Oliver
AM87  Upinder
AM27  Ruhaan   Jesus    pandya        Sriju
AM69  Rochan            John

Desired output:

Id    full_nm  Exec     AD_Executive  Ex_FiscId  Final
AM12  pulari   Athreyu                John       John
AM21           Megan    Fer           Sonal      Sonal
AM31  Burgers           Virat                    Virat
AM41  Saheb    Omar     John          Ram        Ram
AM66  Bhavya   Michael                           Michael
AM81  Borah             John          Anthony    Anthony
AM77  Dutta    Oliver                            Oliver
AM87  Upinder                                    Upinder
AM27  Ruhaan   Jesus    pandya        Sriju      Sriju
AM69  Rochan            John                     John

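【Comments】:

  • Everything, including np.NaN itself, is != np.NaN, so you never get past the first block. You should use pd.isna() to check whether a value is null. But really you should implement all of this logic with np.select: see stackoverflow.com/questions/18194404/…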
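To illustrate the comment above, here is a minimal sketch of the NaN comparison pitfall followed by an np.select version of the fallback chain. The frame `df_demo` and the helper names `cols`, `clean`, `conditions` and `choices` are made up for this example; the rows are a subset of the question's sample. It assumes blanks are stored as empty strings and converts them to NaN first.

```python
import numpy as np
import pandas as pd

# NaN never compares equal to anything, including itself,
# so `value != np.nan` is True for every value.
print(np.nan == np.nan)  # False
print(np.nan != np.nan)  # True
print(pd.isna(np.nan))   # True

# Illustrative frame: a subset of the question's sample rows.
df_demo = pd.DataFrame({
    "full_nm": ["pulari", "Burgers", "Bhavya", "Upinder"],
    "Exec": ["Athreyu", "", "Michael", ""],
    "AD_Executive": ["", "Virat", "", ""],
    "Ex_FiscId": ["John", "", "", ""],
})

# Assumption: blanks are empty strings, so convert them to NaN first.
cols = ["Ex_FiscId", "AD_Executive", "Exec"]  # highest priority first
clean = df_demo[cols].replace("", np.nan)

# np.select checks the conditions in order; the first True one wins,
# and rows where none is True take the default.
conditions = [clean[c].notna() for c in cols]
choices = [clean[c] for c in cols]
df_demo["Final"] = np.select(conditions, choices, default=df_demo["full_nm"])

print(df_demo["Final"].tolist())  # ['John', 'Virat', 'Michael', 'Upinder']
```

Because the conditions are tried in list order, the later ones do not need to repeat "and the earlier columns are empty", unlike the elif chain in the question.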

Tags: pandas data-manipulation


【Solution 1】:

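Let's try this. It assumes the empty strings are first converted to NaN, so that notnull() can tell filled cells from blank ones:

df_ex = df_ex.replace('', np.nan)
df_ex['final'] = df_ex.apply(lambda x: x[x[::-1].notnull().idxmax()], axis=1)

Output: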

    Id  full_nm     Exec    AD_Executive    Ex_FiscId   final
0   AM12    pulari  Athreyu     NaN            John     John
1   AM21    NaN     Megan       Fer            Sonal    Sonal
2   AM31    Burgers NaN         Virat          NaN      Virat
3   AM41    Saheb   Omar        John           Ram      Ram
4   AM66    Bhavya  Michael     NaN            NaN      Michael
5   AM81    Borah   NaN         John           Anthony  Anthony
6   AM77    Dutta   Oliver      NaN            NaN      Oliver
7   AM87    Upinder NaN         NaN            NaN      Upinder
8   AM27    Ruhaan  Jesus       pandya         Sriju    Sriju
9   AM69    Rochan  NaN         John           NaN      John
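The one-liner is dense, so here is a step-by-step sketch of what the lambda does for a single row. The variable names `row`, `reversed_row`, `has_value` and `first_label` are made up for this walkthrough, and the row is the AM31 record with its blanks already converted to NaN.

```python
import numpy as np
import pandas as pd

# The AM31 row from the sample, with blanks already converted to NaN.
row = pd.Series({
    "Id": "AM31",
    "full_nm": "Burgers",
    "Exec": np.nan,
    "AD_Executive": "Virat",
    "Ex_FiscId": np.nan,
})

# Reverse the row so the highest-priority column comes first:
# Ex_FiscId, AD_Executive, Exec, full_nm, Id
reversed_row = row[::-1]

# Boolean mask of filled cells: False, True, False, True, True
has_value = reversed_row.notnull()

# idxmax() returns the label of the first maximum, i.e. the first True.
first_label = has_value.idxmax()
print(first_label)       # AD_Executive

# Look that label up in the original row.
print(row[first_label])  # Virat
```

Note that this approach depends on the column order matching the desired priority from right to left, and that it would fall back to Id if every name column were empty.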

【Comments】:

【Solution 2】:

def final(df_ex):
    # Blank cells in the sample data are empty strings, so compare against "".
    if df_ex['Ex_FiscId'] != "":
        return df_ex['Ex_FiscId']
    elif (df_ex['Ex_FiscId'] == "") and (df_ex['AD_Executive'] != ""):
        return df_ex['AD_Executive']
    elif (df_ex['Ex_FiscId'] == "") and (df_ex['AD_Executive'] == "") and (df_ex['Exec'] != ""):
        return df_ex['Exec']
    elif (df_ex['Ex_FiscId'] == "") and (df_ex['AD_Executive'] == "") and (df_ex['Exec'] == ""):
        return df_ex['full_nm']


df_ex['Final'] = df_ex.apply(final, axis=1)
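Since the question mentions a large dataset, a row-wise apply may be slow. As a possible alternative (not part of the answer above), the same fallback chain can be sketched without apply by turning blanks into NaN and back-filling across columns. The names `df_demo`, `priority` and `filled` are made up for this example, and the rows are a subset of the question's sample.

```python
import numpy as np
import pandas as pd

# Illustrative frame: a subset of the question's sample rows.
df_demo = pd.DataFrame({
    "Id": ["AM12", "AM31", "AM66", "AM87"],
    "full_nm": ["pulari", "Burgers", "Bhavya", "Upinder"],
    "Exec": ["Athreyu", "", "Michael", ""],
    "AD_Executive": ["", "Virat", "", ""],
    "Ex_FiscId": ["John", "", "", ""],
})

# Columns ordered from highest to lowest priority.
priority = ["Ex_FiscId", "AD_Executive", "Exec", "full_nm"]

# Turn blanks into NaN, then pull each row's next filled value leftwards.
# After the back-fill, the first column holds the first non-blank value.
filled = df_demo[priority].replace("", np.nan).bfill(axis=1)
df_demo["Final"] = filled.iloc[:, 0]

print(df_demo["Final"].tolist())  # ['John', 'Virat', 'Michael', 'Upinder']
```

Here the priority is stated explicitly in one list, so adding or reordering fallback columns does not require touching any conditions.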
    

【Comments】:
