【Question Title】: "'int' object is not iterable", 'occurred at index i
【Posted】: 2021-09-29 20:46:54
【Question】:

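I am trying to apply the .mean function to the dataframe "combined_sf2", which has a shape of 43722 rows × 62 columns.

For each row of my dataframe, I want to compute the mean of the values of a few selected attributes, and then create a new attribute/column named "wkQtyEXTMean" holding that mean for every row.

I tried to do this with the .mean function from the statistics module by writing the following function: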

    # Take the selected attributes; if their sum is zero, return a
    # 'No mean' message, otherwise calculate the mean.
    import statistics

    def wkQtyEXTMean(row):
        if (row['wk13QtyEXT'] + row['wk12QtyEXT']) == 0:
            return 'No mean'
        else:
            return statistics.mean(row['wk13QtyEXT'] + row['wk12QtyEXT'])

    # Generate the new column
    combined_sf2['wkQtyEXTMean'] = combined_sf2.apply(wkQtyEXTMean, axis=1)

But I get the following error:

("'int' object is not iterable", 'occurred at index 43721')

Expected result

Any suggestions?

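【Comments】:

  • Make sure the data types of the 'wk13QtyEXT' and 'wk12QtyEXT' columns at index 43721 (combined_sf2.iloc[43721]) are correct. They should be a list of integers/floats.
  • What is the content of row['wk13QtyEXT']? And of row['wk12QtyEXT']? mean needs an iterable full of numbers. row['wk13QtyEXT']+row['wk12QtyEXT'] looks like a single number, not an iterable.
  • OK, then why not just use mean directly? If you want the mean, use statistics.mean([row['wk13QtyEXT'], row['wk12QtyEXT']])
  • But sometimes a row contains 0 and 0. That is why I added this rule to the function to return 'No mean', because (0+0)/2 does not exist.
  • Do you mean doing sum/len directly?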
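
The diagnosis in the comments above can be reproduced outside pandas. This is a minimal sketch, assuming two plain integers `a` and `b` as hypothetical stand-ins for row['wk13QtyEXT'] and row['wk12QtyEXT']: adding them yields a single int, which statistics.mean cannot iterate over, while wrapping them in a list works.

```python
import statistics

# Hypothetical stand-ins for row['wk13QtyEXT'] and row['wk12QtyEXT'].
a, b = 3, 5

try:
    # a + b is the single int 8, not an iterable of numbers.
    statistics.mean(a + b)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Passing the two values as a list gives statistics.mean something to iterate over.
fixed = statistics.mean([a, b])
print(fixed)
```

The same TypeError is what pandas re-raises, together with the index of the row being processed, when the function runs inside apply.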

Tags: python pandas average mean


【Solution 1】:

Please find the updated answer.

import pandas as pd
import statistics
# Creating the dataframe 
df = pd.DataFrame({"A":[12, 4, 5, None, 1],
                   "B":[7, 2, 54, 3, None],
                   "C":[20, 16, 11, 3, 8],
                   "D":[14, 3, None, 2, 6]})
  
# skip the Na values while finding the mean
print(df.mean(axis = 1, skipna = True))
te = df.mean(axis = 1, skipna = True)
print(statistics.mean(te))

0    13.250000
1     6.250000
2    23.333333
3     2.666667
4     5.000000
dtype: float64

This is the answer: 10.1
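
If only some columns should go into each row's mean, the same df.mean(axis=1) call can be applied to a column subset. A minimal sketch reusing the example frame from this answer; picking columns "A" and "B" is an arbitrary choice for illustration, not something the answer specifies.

```python
import pandas as pd

df = pd.DataFrame({"A": [12, 4, 5, None, 1],
                   "B": [7, 2, 54, 3, None],
                   "C": [20, 16, 11, 3, 8],
                   "D": [14, 3, None, 2, 6]})

# Row-wise mean over the selected columns only; NaN values are skipped by default.
subset_mean = df[["A", "B"]].mean(axis=1)
print(subset_mean)

# Mean of those per-row means, computed by pandas itself.
overall = subset_mean.mean()
print(overall)
```

Selecting the columns first keeps the calculation vectorized, so no apply call or per-row Python function is needed.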

Old conversation:

This is the syntax of mean.

# Importing the statistics module
import statistics
  
# list of positive integer numbers
data1 = [1, 3, 4, 5, 7, 9, 2]
  
x = statistics.mean(data1)

Now, here is what you are doing wrong:

return statistics.mean(row['wk13QtyEXT']+row['wk12QtyEXT']) is wrong
 
return statistics.mean(row) is right

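【Comments】:

  • I'm trying to get the average of the sum of the values from this selected range (columns). It is a dataframe, not a list. I'm not trying to compute the mean of a single row, but the mean of the sum of different columns for all rows of the dataframe. How should I do that?

【Solution 2】:

Try (your approach):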

import statistics

def wkQtyEXTMean(row):
    if row['wk13QtyEXT'] == 0 and row['wk12QtyEXT'] == 0:
        return 'No mean'
    return statistics.mean([row['wk13QtyEXT'], row['wk12QtyEXT']])

combined_sf2['wkQtyEXTMean'] = combined_sf2.apply(wkQtyEXTMean, axis=1)

Or without statistics.mean:

def wkQtyEXTMean(row):
    if row['wk13QtyEXT'] == 0 and row['wk12QtyEXT'] == 0:
        return 'No mean'
    return (row['wk13QtyEXT'] + row['wk12QtyEXT']) / 2

combined_sf2['wkQtyEXTMean'] = combined_sf2.apply(wkQtyEXTMean, axis=1)

Or without apply:

combined_sf2['wkQtyEXTMean'] = (combined_sf2['wk13QtyEXT']
                                + combined_sf2['wk12QtyEXT']) / 2
combined_sf2.loc[combined_sf2['wk13QtyEXT'].eq(0)
                 & combined_sf2['wk12QtyEXT'].eq(0), 'wkQtyEXTMean'] = 'No mean'

Or, if you are already using NumPy (np):

combined_sf2['wkQtyEXTMean'] = np.where(
            combined_sf2['wk13QtyEXT'].eq(0) & combined_sf2['wk12QtyEXT'].eq(0),
            'No mean',
            (combined_sf2['wk13QtyEXT'] + combined_sf2['wk12QtyEXT']) / 2
        )
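
One thing to keep in mind with all of the variants above: storing the string 'No mean' next to floats means the column is no longer purely numeric. If the column will be used in further calculations, one alternative is to mark those rows as missing (NaN) instead. This is a sketch under that assumption, not part of the original answer; the sample data is hypothetical and only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
combined_sf2 = pd.DataFrame({
    "wk13QtyEXT": [4, 0, 10],
    "wk12QtyEXT": [2, 0, 0],
})

cols = ["wk13QtyEXT", "wk12QtyEXT"]

# True for rows where every selected column is 0.
both_zero = combined_sf2[cols].eq(0).all(axis=1)

# Row-wise mean of the selected columns; rows flagged above become NaN.
combined_sf2["wkQtyEXTMean"] = combined_sf2[cols].mean(axis=1).mask(both_zero)
print(combined_sf2)
```

The column stays float64, and the NaN rows can still be rendered as 'No mean' later, for example with fillna at display time.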

【Comments】:
