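【Question title】: Numpy float object not iterable while running a function
【Posted】: 2021-04-16 11:19:51
【Question】:

I am trying to assign names to clusters based on some comparison conditions, but I get an error saying that a numpy float object is not iterable. Also, I would prefer not to subset the dataset as shown below, going from df to df1 and then concatenating it back. Here is the code: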

import pandas as pd
df = pd.DataFrame({'cluster':[0, 1, 2, 9999], 'earlypc':[88.943,4.034,6.839,0.488],'C':[3.491,8.306,75.329,34.5],'D':[14.548,87.66,17.832,65.012]})
df1=df[df['cluster']!=999]
def cluster(a,b,c,d):
    if(max(b)==b):
        return 'high'
    elif (max(c)==c):
         return 'low'
    elif (max(d)==d):
        return 'medium'
    else: return 'medium'

df1['Vendor_Segmentation']=df1.apply(lambda x:cluster(x['cluster'],x['earlypc'],x['C'],x['D']),axis=1)

TypeError: 'numpy.float64' object is not iterable

【Question comments】:

    Tags: python pandas


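    【Solution 1】:

    I agree with the other answer's suggestion not to use apply, and offer two alternatives from the numpy package that are designed for cases like yours: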

    import numpy as np
    import pandas as pd
    
    1. numpy.select
    # Specify the conditions
    conditions = [ 
        (df1['earlypc'] == df1['earlypc'].max()),
        (df1['C'] == df1['C'].max()),
        (df1['D'] == df1['D'].max())     
        ]
    
    # What each condition should return
    choices =['high','low','medium']
    
    # Return the array as a column
    df1['Vendor_segmentation'] = np.select(conditions, choices,default='medium')
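One detail worth keeping in mind: per the NumPy documentation, when several conditions are true for the same row, np.select uses the first matching one in the list. A minimal sketch, using a hypothetical frame `demo` (not the data from the question) in which row 0 holds the maximum of both columns:

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 has the largest value in both columns
demo = pd.DataFrame({'earlypc': [90.0, 4.0, 6.0],
                     'C': [80.0, 8.0, 75.0]})

conditions = [
    demo['earlypc'] == demo['earlypc'].max(),  # true for row 0
    demo['C'] == demo['C'].max(),              # also true for row 0
]
choices = ['high', 'low']

# Row 0 satisfies both conditions; the first one in the list wins
labels = np.select(conditions, choices, default='medium')
print(list(labels))
```

So the order of the condition list acts as a priority order, which matters only if a row can be the maximum of more than one column.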
    

    2. numpy.where

    df1['Vendor_segmentation'] = np.where(df1['earlypc'].eq(df1['earlypc'].max()),'high',
                                  np.where(df1['C'].eq(df1['C'].max()),'low',
                                  np.where(df1['D'].eq(df1['D'].max()),'medium',
                                                                       'medium')))
    

    Output:

    Out[531]: 
    
       cluster  earlypc       C       D Vendor_segmentation
    0        0   88.943   3.491  14.548                high
    1        1    4.034   8.306  87.660              medium
    2        2    6.839  75.329  17.832                 low
    3     9999    0.488  34.500  65.012              medium
    

    Series.eq and == are equivalent.
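As a quick check of that equivalence, the sketch below rebuilds the question's frame (the variable names used for the comparison are introduced here for illustration only) and compares the two spellings, along with the labels produced by np.select and by the nested np.where:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'cluster': [0, 1, 2, 9999],
                   'earlypc': [88.943, 4.034, 6.839, 0.488],
                   'C': [3.491, 8.306, 75.329, 34.5],
                   'D': [14.548, 87.66, 17.832, 65.012]})

# Method form and operator form of the same element-wise comparison
mask_eq = df['earlypc'].eq(df['earlypc'].max())
mask_op = df['earlypc'] == df['earlypc'].max()
same_mask = mask_eq.equals(mask_op)

# Labels from np.select
via_select = np.select(
    [df['earlypc'] == df['earlypc'].max(),
     df['C'] == df['C'].max(),
     df['D'] == df['D'].max()],
    ['high', 'low', 'medium'],
    default='medium',
)

# Labels from nested np.where
via_where = np.where(df['earlypc'].eq(df['earlypc'].max()), 'high',
            np.where(df['C'].eq(df['C'].max()), 'low',
            np.where(df['D'].eq(df['D'].max()), 'medium',
                                                'medium')))

same_labels = bool((via_select == via_where).all())
print(same_mask, same_labels)
```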

    【Comments】:

      【Solution 2】:

      How about not using apply at all:

      df1['Vendor_Segmentation'] = 'medium'
      df1.loc[df1.earlypc==df1.earlypc.max(), "Vendor_Segmentation"] = 'high'
      df1.loc[df1.C==df1.C.max(), "Vendor_Segmentation"] = 'low'
      df1.loc[df1.D==df1.D.max(), "Vendor_Segmentation"] = 'medium'
      

      This gives the expected result:

         cluster  earlypc       C       D Vendor_Segmentation
      0        0   88.943   3.491  14.548                high
      1        1    4.034   8.306  87.660              medium
      2        2    6.839  75.329  17.832                 low
      3     9999    0.488  34.500  65.012              medium
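As for why the original code raised the error: with axis=1, apply passes one row at a time, so x['earlypc'] is a single numpy.float64 rather than a column, and the built-in max() needs an iterable. A minimal reproduction, assuming this reading of the traceback (the variable names are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'cluster': [0, 1, 2, 9999],
                   'earlypc': [88.943, 4.034, 6.839, 0.488],
                   'C': [3.491, 8.306, 75.329, 34.5],
                   'D': [14.548, 87.66, 17.832, 65.012]})

# A row-wise apply hands each row over as a Series; indexing it gives a scalar
first_row = df.iloc[0]
value = first_row['earlypc']
is_scalar = isinstance(value, np.float64)

try:
    max(value)  # same kind of call as max(b) in the question's function
    message = None
except TypeError as exc:
    message = str(exc)

# Comparing the scalar against the column maximum works as intended
is_column_max = bool(value == df['earlypc'].max())
print(is_scalar, message, is_column_max)
```

This is why both answers compare each column against its own .max() instead of calling max() on a per-row value.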
      

      【Comments】:
