How to get the average of the top n values and bottom n values for every value in a data frame column (answers)

【Question title】: How to get average value of top n values and bottom n values for every value in a data frame column
【Posted】: 2020-08-24 22:22:35
【Question description】:

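The easiest way to explain this is probably with an example.

Imagine the following data frame:

I would like to get the average of the top 3 values and the bottom 3 values for every value in column b. So it should look like this:

Any help is appreciated.

Thanks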

【Question discussion】:

I don't quite understand: which average do you mean? Which top 3 and which bottom 3? How did you get 3.3, 2.3 and 1.83?

Tags: python pandas dataframe average

【Solution 1】:

Here is my solution, with some help from numpy:
(df is your example data frame)

import numpy as np

length = df.shape[0]   # Number of rows in the dataframe
windowSize = 3         # We look at the 3 values above and the 3 values below each row
b = df['b'].to_numpy() # Column b as a NumPy array, extracted once

# Rows without a full window on both sides are skipped
for i in range(windowSize, length - windowSize):
    # Positions (0-based) of the 3 values above and the 3 values below row i
    top3Idxs = np.arange(i - windowSize, i)
    bottom3Idxs = np.arange(i + 1, i + 1 + windowSize)

    # Values of column b at those positions
    top3Vals = b[top3Idxs]
    bottom3Vals = b[bottom3Idxs]

    # Average of the 6 neighbouring values (row i itself is excluded)
    avg = np.mean(np.concatenate((top3Vals, bottom3Vals)))

    # Store the average in column c; assumes the default 0..length-1 index
    df.at[i, 'c'] = avg
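
The same neighbour average can also be computed without an explicit loop. The following is only a sketch of one possible alternative, not the answerer's method. It assumes the sample data printed in Solution 2 (b = [5, 4, 2, 2, 4, 3, 2, 1, 0]) and the default integer index. It takes a centered rolling sum over 2n + 1 rows, subtracts the row's own value, and divides by 2n. Rows without a full window on both sides come out as NaN.

```python
import pandas as pd

# Sample data assumed from Solution 2 in this thread
df = pd.DataFrame({'a': list(range(1, 10)), 'b': [5, 4, 2, 2, 4, 3, 2, 1, 0]})
n = 3  # number of neighbours taken above and below each row

# Sum of the centered window: n rows above, the row itself, n rows below.
# With the default min_periods, incomplete windows yield NaN.
window_sum = df['b'].rolling(window=2 * n + 1, center=True).sum()

# Remove the row's own value, then average the remaining 2n neighbours
df['c'] = (window_sum - df['b']) / (2 * n)

print(df)
```

With this data, rows 3, 4 and 5 come out at about 3.33, 2.33 and 1.83, which is in line with the 3.3, 2.3 and 1.83 mentioned in the question discussion.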

【Discussion】:

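【Solution 2】:

I don't fully understand your question or how you got the values in column 'c'. If you wanted the top and bottom averages for both columns, that would be 4 separate values (whereas you only have 3 values in column 'c'). I'm also not sure whether top/bottom refers to the 3 highest/lowest values in each column (since you say top 'n' values, I'm guessing not).

The top/bottom averages for col 'a' and col 'b' would look like this: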

import pandas as pd

data = {'a': list(range(1,10)), 'b': [5, 4, 2, 2, 4, 3, 2, 1, 0]}
df = pd.DataFrame(data)

    a   b
0   1   5
1   2   4
2   3   2
3   4   2
4   5   4
5   6   3
6   7   2
7   8   1
8   9   0

n = 3

averages = {}
for col in df.columns:
    # Here 'bottom' is the first n rows and 'top' is the last n rows, by position
    averages[col+'_bottom_avg'] = df[col][:n].mean()
    averages[col+'_top_avg'] = df[col][-n:].mean()

Output:

averages
{'a_bottom_avg': 2.0,
 'a_top_avg': 8.0,
 'b_bottom_avg': 3.6666666666666665,
 'b_top_avg': 1.0}

If you want the averages of the 3 largest/smallest values instead, you can sort the column first:

averages = {}
for col in df.columns:
    sorted_col = df[col].sort_values()   # ascending, so the smallest values come first
    averages[col+'_bottom_avg'] = sorted_col[:n].mean()
    averages[col+'_top_avg'] = sorted_col[-n:].mean()

Output:

averages
{'a_bottom_avg': 2.0,
 'a_top_avg': 8.0,
 'b_bottom_avg': 1.0,
 'b_top_avg': 4.333333333333333}
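
As a possible variation on the sorting approach (a sketch, not part of the original answer), pandas' Series.nsmallest and Series.nlargest pick the n extreme values directly, so the full sort is not needed. The data and the key names are reused from above.

```python
import pandas as pd

# Same sample data as in this answer
df = pd.DataFrame({'a': list(range(1, 10)), 'b': [5, 4, 2, 2, 4, 3, 2, 1, 0]})
n = 3

averages = {}
for col in df.columns:
    # Mean of the n smallest values in the column
    averages[col + '_bottom_avg'] = df[col].nsmallest(n).mean()
    # Mean of the n largest values in the column
    averages[col + '_top_avg'] = df[col].nlargest(n).mean()

print(averages)
```

For this data the result matches the sorted version: 2.0 and 8.0 for column a, 1.0 and about 4.33 for column b.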

Apologies if I have completely misunderstood your question.

【Discussion】: