Answers to: Problems with putting the apply method

【Question title】: Problems with putting the apply method
【Posted】: 2022-01-25 14:38:47
【Question description】:

def mean1(x):
    return sum(x)/len(x)

df2['children'] = df2['children'].apply(mean1)

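I get the following error: TypeError: 'int' object is not iterable

I believe I am using apply() correctly, but it still raises the error.

【Comments on the question】:

Tags: python pandas numpy apply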

【Solution 1】:

Apply mean1 to the whole column, not to each item in it:

df2['children'] = mean1(df2['children'])

Or, better still, use pandas' built-in mean method:

df2['children'] = df2['children'].mean()
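
To make the two options above concrete, here is a minimal sketch. The contents of df2 are not shown in the question, so the data and the children_mean column name below are assumptions for illustration only. It also shows that assigning the result back to a column writes the same scalar into every row, which may or may not be what you want.

```python
import pandas as pd

def mean1(x):
    return sum(x) / len(x)

# Hypothetical stand-in for df2; the real data is not shown in the question.
df2 = pd.DataFrame({"children": [0, 2, 1, 3]})

# Option 1: call the function on the whole column (a Series is iterable).
col_mean_manual = mean1(df2["children"])

# Option 2: use the built-in Series.mean().
col_mean_builtin = df2["children"].mean()

print(col_mean_manual, col_mean_builtin)

# Assigning a scalar to a column broadcasts it to every row.
# "children_mean" is an illustrative column name, not from the question.
df2["children_mean"] = col_mean_builtin
print(df2)
```

If you only need the number itself, keeping it in a plain variable avoids overwriting the original column.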

【Comments】:

【Solution 2】:

With a sample dataframe:

In [372]: df
Out[372]: 
   0    1   2   3
1  0    1   2   3
2  4  100   6   7
3  8    9  10  11
In [373]: df[1]     # one column
Out[373]: 
1      1
2    100
3      9
Name: 1, dtype: int64

And your function, modified to show what x receives:

In [375]: def mean1(x):
     ...:     print(x)
     ...:     return sum(x)/len(x)
     ...: 
In [376]: df[1].apply(mean1)
1
Traceback (most recent call last):
  File "<ipython-input-376-e12f9dfea5ae>", line 1, in <module>
    df[1].apply(mean1)
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/series.py", line 4357, in apply
    return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/apply.py", line 1043, in apply
    return self.apply_standard()
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/apply.py", line 1099, in apply_standard
    mapped = lib.map_infer(
  File "pandas/_libs/lib.pyx", line 2859, in pandas._libs.lib.map_infer
  File "<ipython-input-375-48efb527b53e>", line 3, in mean1
    return sum(x)/len(x)
TypeError: 'int' object is not iterable

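Notice that x is 1, a single number. Series.apply calls your function once per element, and Python cannot run sum or len on 1. The error is not in apply but in your function, which was not written to handle a single number.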
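
The same behaviour can be reproduced without reading a traceback. The sketch below is only an illustration: the Series s is a stand-in for the column, and spy is a hypothetical wrapper that records each argument before passing it on to mean1.

```python
import pandas as pd

def mean1(x):
    return sum(x) / len(x)

# Stand-in for a numeric column such as df[1] above.
s = pd.Series([1, 100, 9])

received = []  # every argument that apply hands to the function

def spy(x):
    # Hypothetical helper: record the argument, then call mean1 on it.
    received.append(x)
    return mean1(x)

try:
    s.apply(spy)
    error = None
except TypeError as exc:
    error = str(exc)

print(received)  # only the first element was passed before the failure
print(error)
```

Because the very first call already fails, received holds a single scalar rather than the whole column.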

What are you trying to do? Take the mean of the whole column? Or the mean of an array or list stored in each cell?

In [378]: mean1(df[1])
1      1
2    100
3      9
Name: 1, dtype: int64
Out[378]: 36.666666666666664

With apply, your function will work if the dataframe column contains lists or arrays:

In [386]: df = pd.DataFrame([None,None,None],columns=['one'])
In [387]: df['one'] = [np.ones(5).tolist(),np.arange(4).tolist(),np.zeros(9).tolist()]
In [388]: df
Out[388]: 
                                             one
0                      [1.0, 1.0, 1.0, 1.0, 1.0]
1                                   [0, 1, 2, 3]
2  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
In [389]: df['one'].apply(mean1)
[1.0, 1.0, 1.0, 1.0, 1.0]
[0, 1, 2, 3]
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
Out[389]: 
0    1.0
1    1.5
2    0.0
Name: one, dtype: float64
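
If the cells really do hold lists, one possible variation is to let NumPy compute each cell's mean. This is a sketch under assumptions, not part of the original answer: cell_mean is a hypothetical helper, and treating a bare number as its own mean (which np.mean does) is a design choice you may not want.

```python
import numpy as np
import pandas as pd

def cell_mean(cell):
    # np.mean accepts lists and arrays, and also a bare scalar,
    # in which case it returns that value as a float.
    return float(np.mean(cell))

# Column of lists, similar in shape to the example above.
lists = pd.Series([[1.0, 1.0, 1.0], [0, 1, 2, 3], [0.0, 0.0]])
per_cell = lists.apply(cell_mean)
print(per_cell.tolist())

# Mixed column: a list in one cell, a plain number in another.
mixed = pd.Series([[1, 2, 3], 4], dtype=object)
mixed_means = mixed.apply(cell_mean)
print(mixed_means.tolist())
```

Unlike mean1, this version does not raise on a scalar cell, so a column that mixes lists and numbers is processed without the TypeError.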

【Comments】:

Thank you very much for your solution. It is correct, and it cleared up my understanding.