【Question title】: How to apply a function to a pandas DataFrame in Python
【Posted】: 2021-12-29 00:32:15
【Question】:

I wrote a function that takes four input values and returns a result based on them:

def python_function(a, b, c, d):
    if [a, b, c, d].count(0) == 4:
        return "NA"

    average = (a + b + c + d) / (4 - [a, b, c, d].count(0))

    # change to a for q1, b for q2, c for q3, d for q4
    if c >= average:
        if c > b:
            return "G"
        else:
            return "S"
    elif c < average:
        return "B"

    return "NA"

Calling the function above:

python_function(5.3,9.7,.4,0)

'B'

python_function(5.3,9.7,10.4,0)

'G'

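However, when I apply the same function to the columns of a pandas DataFrame, I get an error. I'm sure there is a way to handle float values with logical operators, but I don't know how to do it.

DataFrame: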

   q1_profit    q2_profit   q3_profit   q4_profit
0   89969.7     112896.7    25665.4     0
1   1.6         459.9       295.9       0
2   0.9         9.5         5.3         0
3   1396.1      1105.2      0.2         0
4   17.9        365.5       191.1       0

Data types:

q1_profit            1600 non-null float64
q2_profit            1600 non-null float64
q3_profit            1600 non-null float64
q4_profit            1600 non-null int64




data["rating"] = python_function(data["q1_profit"],data["q2_profit"],data["q3_profit"],data["q4_profit"])

Error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-51-6dba2870dd9c> in <module>
----> 1 data["rating"] = python_function(data["q1_profit"],data["q2_profit"],data["q3_profit"],data["q4_profit"])

<ipython-input-39-47792387b172> in python_function(a, b, c, d)
      1 def python_function(a, b, c, d):
----> 2     if [a, b, c, d].count(0) == 4:
      3         return "NA"
      4 
      5     average = (a + b + c + d) / (4 - [a, b, c, d].count(0))

~\Anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
   1476         raise ValueError("The truth value of a {0} is ambiguous. "
   1477                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
-> 1478                          .format(self.__class__.__name__))
   1479 
   1480     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

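【Comments】:

    Tags: python pandas user-defined-functions


    【Solution 1】:

    It looks like you are operating on each row of the DataFrame, so I think apply with axis=1 is the best option: it passes one row at a time, so the function receives scalar values instead of whole columns.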

    input_data = {
         'q1_profit':[89969.7,1.6,0.9,1396.1 ,17.9 ],
         'q2_profit':[112896.7, 459.9,9.5,1105.2 , 365.5],
         'q3_profit' :[25665.4,295.9 ,5.3,0.2, 191.1],
         'q4_profit':[0,0,0,0,0]
          }
    
    import pandas as pd 
    data = pd.DataFrame(data=input_data)
     
    data['rating'] = data.apply(lambda row: python_function(row["q1_profit"],row["q2_profit"],row["q3_profit"],row["q4_profit"]), axis=1)
    
    print(data)
    

    Output:

       q1_profit  q2_profit  q3_profit  q4_profit rating
    0    89969.7   112896.7    25665.4          0      B
    1        1.6      459.9      295.9          0      S
    2        0.9        9.5        5.3          0      S
    3     1396.1     1105.2        0.2          0      B
    4       17.9      365.5      191.1          0      B
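
For larger frames, a row-wise apply can be slow. The same rules could be expressed on whole columns instead. The following is only a sketch, assuming the column names from the question; the helper names `nonzero`, `average`, `conditions`, and `choices` are illustrative and not part of the original answer, and `numpy.select` is swapped in for the `if`/`elif` chain.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "q1_profit": [89969.7, 1.6, 0.9, 1396.1, 17.9],
    "q2_profit": [112896.7, 459.9, 9.5, 1105.2, 365.5],
    "q3_profit": [25665.4, 295.9, 5.3, 0.2, 191.1],
    "q4_profit": [0, 0, 0, 0, 0],
})

quarters = data[["q1_profit", "q2_profit", "q3_profit", "q4_profit"]]

# Number of non-zero quarters in each row (0 means every quarter is zero).
nonzero = quarters.ne(0).sum(axis=1)

# Average over the non-zero quarters only; all-zero rows become NaN
# instead of triggering a division by zero.
average = quarters.sum(axis=1) / nonzero.where(nonzero > 0)

b = data["q2_profit"]
c = data["q3_profit"]

# Conditions are checked in order, mirroring the if/elif chain
# of python_function in the question.
conditions = [
    nonzero.eq(0).to_numpy(),
    (c.ge(average) & c.gt(b)).to_numpy(),
    c.ge(average).to_numpy(),
    c.lt(average).to_numpy(),
]
choices = ["NA", "G", "S", "B"]

data["rating"] = np.select(conditions, choices, default="NA")
print(data)
```

On the five sample rows this should give the same ratings as the apply version above. Because the sum is taken column-wise, the last floating-point digit of the average can differ from `a + b + c + d`, which only matters when `c` sits exactly on the average.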
    

    【Discussion】:

      【Solution 2】:
      data["rating"]  = data.apply(lambda x : python_function(x.q1_profit,x.q2_profit,x.q3_profit,x.q4_profit),
               axis =1)
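
As background on why the direct call in the question fails while apply works: `list.count(0)` compares every list element with `0` and needs a single True/False from each comparison, but comparing a Series with `0` yields an element-wise Series. A small sketch that reproduces this, using a made-up two-element Series `a` as a stand-in for a column:

```python
import pandas as pd

# Hypothetical stand-in for one of the profit columns.
a = pd.Series([5.3, 0.0])

# `a == 0` is an element-wise comparison and returns a boolean Series.
print(a == 0)

# list.count(0) has to turn that Series into one bool, which pandas refuses.
try:
    [a].count(0)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

With `axis=1`, apply hands the function scalars from a single row, so each comparison yields a plain bool and the original function runs unchanged.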
      

      【Discussion】:
